NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS GROUPS A & B
All documents cited herein are incorporated by reference in their entirety.
TECHNICAL FIELD 

This invention relates to nucleic acid and proteins from the bacteria Streptococcus agalactiae (GBS) and
Streptococcus pyogenes (GAS).

BACKGROUND ART 

Once thought to infect only cows, the Gram-positive bacterium Streptococcus agalactiae (or "group B
streptococcus", abbreviated to "GBS") is now known to cause serious disease, bacteremia and
meningitis, in immunocompromised individuals and in neonates. There are two types of neonatal
infection. The first (early onset, usually within 5 days of birth) is manifested by bacteremia and
pneumonia. It is contracted vertically as a baby passes through the birth canal. GBS colonises the vagina
of about 25% of young women, and approximately 1% of infants born via a vaginal birth to colonised
mothers will become infected. Mortality is between 50-70%. The second is a meningitis that occurs 10 to
60 days after birth. If pregnant women are vaccinated with type III capsule so that the infants are
passively immunised, the incidence of the late onset meningitis is reduced but is not entirely eliminated.

The "B" in "GBS" refers to the Lancefield classification, which is based on the antigenicity of a
carbohydrate which is soluble in dilute acid and called the C carbohydrate. Lancefield identified 13 types
of C carbohydrate, designated A to O, that could be serologically differentiated. The organisms that
most commonly infect humans are found in groups A, B, D, and G. Within group B, strains can be
divided into 8 serotypes (Ia, Ib, Ia/c, II, III, IV, V, and VI) based on the structure of their
polysaccharide capsule.

Group A streptococcus ("GAS", S.pyogenes) is a frequent human pathogen, estimated to be present in
between 5-15% of normal individuals without signs of disease. When host defences are compromised,
or when the organism is able to exert its virulence, or when it is introduced to vulnerable tissues or hosts,
however, an acute infection occurs. Diseases include puerperal fever, scarlet fever, erysipelas,
pharyngitis, impetigo, necrotising fasciitis, myositis and streptococcal toxic shock syndrome.

S.pyogenes is typically treated using antibiotics. Although S.agalactiae is inhibited by antibiotics,
however, it is not killed by penicillin as easily as GAS. Prophylactic vaccination is thus preferable.

Current GBS vaccines are based on polysaccharide antigens, although these suffer from poor
immunogenicity. Anti-idiotypic approaches have also been used (e.g. WO99/54457). There remains a
need, however, for effective adult vaccines against S.agalactiae infection. There also remains a need for
vaccines against S.pyogenes infection.

It is an object of the invention to provide proteins which can be used in the development of such
vaccines. The proteins may also be useful for diagnostic purposes, and as targets for antibiotics.

DISCLOSURE OF THE INVENTION

The invention provides proteins comprising the S.agalactiae amino acid sequences disclosed in the
examples, and proteins comprising the S.pyogenes amino acid sequences disclosed in the examples.
These amino acid sequences are the even SEQ IDs between 1 and 10960.

It also provides proteins comprising amino acid sequences having sequence identity to the S.agalactiae
amino acid sequences disclosed in the examples, and proteins comprising amino acid sequences having
sequence identity to the S.pyogenes amino acid sequences disclosed in the examples. Depending on the
particular sequence, the degree of sequence identity is preferably greater than 50% (e.g. 60%, 70%,
80%, 90%, 95%, 99% or more). These proteins include homologs, orthologs, allelic variants and
functional mutants. Typically, 50% identity or more between two proteins is considered to be an
indication of functional equivalence. Identity between proteins is preferably determined by the
Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford
Molecular), using an affine gap search with parameters gap open penalty=12 and gap extension
penalty=1.
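
By way of illustration only, the identity calculation described above can be sketched in Python as a Smith-Waterman local alignment with affine gap penalties (Gotoh's formulation), using gap open penalty 12 and gap extension penalty 1. This is not the MPSRCH implementation. The function names, the flat match/mismatch scores (5 and -4) used in place of a substitution matrix, and the convention that a gap of length k costs gap_open + (k-1)*gap_extend are all assumptions made for the example.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Best local alignment of a and b as (score, identical_positions, alignment_length).

    Assumed scoring (not taken from MPSRCH): flat match/mismatch scores, and a
    gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    NEG = (float("-inf"), 0, 0)   # "no alignment ends here in a gap"
    ZERO = (0, 0, 0)              # empty alignment (local alignment restart)
    rows, cols = len(a) + 1, len(b) + 1
    H = [[ZERO] * cols for _ in range(rows)]  # best alignment ending at (i, j)
    E = [[NEG] * cols for _ in range(rows)]   # ... ending with a gap in a
    F = [[NEG] * cols for _ in range(rows)]   # ... ending with a gap in b
    best = ZERO

    def step(cell, delta, identical=0):
        # Extend an alignment by one column, carrying its statistics along.
        score, matches, length = cell
        return (score + delta, matches + identical, length + 1)

    for i in range(1, rows):
        for j in range(1, cols):
            E[i][j] = max(step(H[i][j - 1], -gap_open), step(E[i][j - 1], -gap_extend))
            F[i][j] = max(step(H[i - 1][j], -gap_open), step(F[i - 1][j], -gap_extend))
            same = a[i - 1] == b[j - 1]
            diag = step(H[i - 1][j - 1], match if same else mismatch, int(same))
            # Tuples compare by score first, so max() picks the best-scoring path.
            H[i][j] = max(ZERO, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


def percent_identity(a, b, **scoring):
    """Identical positions as a percentage of the local alignment length."""
    _, matches, length = smith_waterman_affine(a, b, **scoring)
    return 100.0 * matches / length if length else 0.0
```

Under these assumptions, two sequences would meet the "greater than 50%" criterion when percent_identity returns a value above 50. Note that identity is computed here over the aligned region only; other definitions (for example, over the full length of the shorter sequence) give different values.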

Preferred proteins of the invention are GBS1 to GBS689 (see Table IV).

The invention further provides proteins comprising fragments of the S.agalactiae amino acid sequences
disclosed in the examples, and proteins comprising fragments of the S.pyogenes amino acid sequences
disclosed in the examples. The fragments should comprise at least n consecutive amino acids from the
sequences and, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 30,
40, 50, 60, 70, 80, 90, 100 or more). Preferably the fragments comprise one or more epitopes from the
sequence. Other preferred fragments are (a) the N-terminal signal peptides of the proteins disclosed in
the examples, (b) the proteins disclosed in the examples, but without their N-terminal signal peptides, (c)
fragments common to the related GAS and GBS proteins disclosed in the examples, and (d) the proteins
disclosed in the examples, but without their N-terminal amino acid residue.
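
As a hedged illustration of fragment type (c), the following Python sketch lists the fragments of n consecutive amino acids that occur in both of two related sequences. The function names and the example sequences in the usage note are hypothetical and are not taken from the sequence listing; only exact matches are considered.

```python
def fragments(seq, n):
    """All distinct fragments of n consecutive residues in seq."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def common_fragments(gas_seq, gbs_seq, n=7):
    """Fragments of length n shared exactly by a GAS and a GBS sequence.

    n defaults to 7, the minimum fragment length given in the text.
    """
    return fragments(gas_seq, n) & fragments(gbs_seq, n)
```

For example, with the made-up sequences "MKTLLAVGSAAL" and "QQTLLAVGSPP", the only shared 7-residue fragment is "TLLAVGS". Longer shared regions can be recovered by raising n or by merging overlapping hits.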

The proteins of the invention can, of course, be prepared by various means (e.g. recombinant
expression, purification from GAS or GBS, chemical synthesis etc.) and in various forms (e.g. native,
fusions, glycosylated, non-glycosylated etc.). They are preferably prepared in substantially pure form
(i.e. substantially free from other streptococcal or host cell proteins) or substantially isolated form.
Proteins of the invention are preferably streptococcal proteins.

According to a further aspect, the invention provides antibodies which bind to these proteins. These
may be polyclonal or monoclonal and may be produced by any suitable means (e.g. by recombinant
expression). To increase compatibility with the human immune system, the antibodies may be chimeric
or humanised (e.g. Breedveld (2000) Lancet 355(9205):735-740; Gorman & Clark (1990) Semin.
Immunol. 2:457-466), or fully human antibodies may be used. The antibodies may include a detectable
label (e.g. for diagnostic assays).

According to a further aspect, the invention provides nucleic acid comprising the S.agalactiae
nucleotide sequences disclosed in the examples, and nucleic acid comprising the S.pyogenes nucleotide
sequences disclosed in the examples. These nucleic acid sequences are the odd SEQ IDs between 1 and
10966.

In addition, the invention provides nucleic acid comprising nucleotide sequences having sequence
identity to the S.agalactiae nucleotide sequences disclosed in the examples, and nucleic acid comprising
nucleotide sequences having sequence identity to the S.pyogenes nucleotide sequences disclosed in the
examples. Identity between sequences is preferably determined by the Smith-Waterman homology
search algorithm as described above.

Furthermore, the invention provides nucleic acid which can hybridise to the S.agalactiae nucleic acid
disclosed in the examples, and nucleic acid which can hybridise to the S.pyogenes nucleic acid disclosed
in the examples, preferably under 'high stringency' conditions (e.g. 65°C in 0.1xSSC, 0.5% SDS
solution).

Nucleic acid comprising fragments of these sequences are also provided. These should comprise at least
n consecutive nucleotides from the S.agalactiae or S.pyogenes sequences and, depending on the
particular sequence, n is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150,
200 or more). The fragments may comprise sequences which are common to the related GAS and GBS
sequences disclosed in the examples.

According to a further aspect, the invention provides nucleic acid encoding the proteins and protein
fragments of the invention.

The invention also provides: nucleic acid comprising nucleotide sequence SEQ ID 10967; nucleic acid
comprising nucleotide sequences having sequence identity to SEQ ID 10967; nucleic acid which can
hybridise to SEQ ID 10967 (preferably under 'high stringency' conditions); nucleic acid comprising a
fragment of at least n consecutive nucleotides from SEQ ID 10967, wherein n is 10 or more e.g. 12, 14,
15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800,
900, 1000, 1500, 2000, 3000, 4000, 5000, 10000, 100000, 1000000 or more.

Nucleic acids of the invention can be used in hybridisation reactions (e.g. Northern or Southern blots, or
in nucleic acid microarrays or 'gene chips') and amplification reactions (e.g. PCR, SDA, SSSR, LCR,
TMA, NASBA etc.) and other nucleic acid techniques.

It should also be appreciated that the invention provides nucleic acid comprising sequences
complementary to those described above (e.g. for antisense or probing, or for use as primers).
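
For illustration, a complementary sequence of the kind referred to above can be derived in silico as follows. This is a minimal Python sketch: the function name is hypothetical, and only the unambiguous DNA letters A, C, G and T are handled (RNA and IUPAC ambiguity codes are left out as a simplifying assumption).

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

The result is reversed so that the complementary strand, like the input, reads 5' to 3', which is the orientation in which probe and primer sequences are conventionally written.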

Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by chemical
synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can take various forms
(e.g. single stranded, double stranded, vectors, primers, probes, labelled etc.). The nucleic acid is
preferably in substantially isolated form.

Nucleic acid according to the invention may be labelled e.g. with a radioactive or fluorescent label. This
is particularly useful where the nucleic acid is to be used in nucleic acid detection techniques e.g. where
the nucleic acid is a primer or a probe for use in techniques such as PCR, LCR, TMA, NASBA etc.

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those
containing modified backbones, and also peptide nucleic acids (PNA) etc.

According to a further aspect, the invention provides vectors comprising nucleotide sequences of the 
invention (e.g. cloning or expression vectors) and host cells transformed with such vectors. 

According to a further aspect, the invention provides compositions comprising protein, antibody, and/or
nucleic acid according to the invention. These compositions may be suitable as immunogenic
compositions, for instance, or as diagnostic reagents, or as vaccines.

The invention also provides nucleic acid, protein, or antibody according to the invention for use as
medicaments (e.g. as immunogenic compositions or as vaccines) or as diagnostic reagents. It also
provides the use of nucleic acid, protein, or antibody according to the invention in the manufacture of: (i)
a medicament for treating or preventing disease and/or infection caused by streptococcus; (ii) a
diagnostic reagent for detecting the presence of streptococcus or of antibodies raised against
streptococcus; and/or (iii) a reagent which can raise antibodies against streptococcus. Said
streptococcus may be any species, group or strain, but is preferably S.agalactiae, especially serotype
III or V, or S.pyogenes. Said disease may be bacteremia, meningitis, puerperal fever, scarlet fever,
erysipelas, pharyngitis, impetigo, necrotising fasciitis, myositis or toxic shock syndrome.

The invention also provides a method of treating a patient, comprising administering to the patient a
therapeutically effective amount of nucleic acid, protein, and/or antibody of the invention. The patient
may either be at risk from the disease themselves or may be a pregnant woman ('maternal immunisation'
e.g. Glezen & Alpers (1999) Clin. Infect. Dis. 28:219-224).

Administration of protein antigens is a preferred method of treatment for inducing immunity. 

Administration of antibodies of the invention is another preferred method of treatment. This method of
passive immunisation is particularly useful for newborn children or for pregnant women. This method
will typically use monoclonal antibodies, which will be humanised or fully human.

The invention also provides a kit comprising primers (e.g. PCR primers) for amplifying a template
sequence contained within a Streptococcus (e.g. S.pyogenes or S.agalactiae) nucleic acid sequence, the
kit comprising a first primer and a second primer, wherein the first primer is substantially complementary
to said template sequence and the second primer is substantially complementary to a complement of said
template sequence, wherein the parts of said primers which have substantial complementarity define the
termini of the template sequence to be amplified. The first primer and/or the second primer may include
a detectable label (e.g. a fluorescent label).

The invention also provides a kit comprising first and second single-stranded oligonucleotides which
allow amplification of a Streptococcus template nucleic acid sequence contained in a single- or
double-stranded nucleic acid (or mixture thereof), wherein: (a) the first oligonucleotide comprises a primer
sequence which is substantially complementary to said template nucleic acid sequence; (b) the second
oligonucleotide comprises a primer sequence which is substantially complementary to the complement
of said template nucleic acid sequence; (c) the first oligonucleotide and/or the second oligonucleotide
comprise(s) sequence which is not complementary to said template nucleic acid; and (d) said primer
sequences define the termini of the template sequence to be amplified. The non-complementary
sequence(s) of feature (c) are preferably upstream of (i.e. 5' to) the primer sequences. One or both of
these (c) sequences may comprise a restriction site (e.g. EP-B-0509612) or a promoter sequence (e.g.
EP-B-0505012). The first oligonucleotide and/or the second oligonucleotide may include a detectable
label (e.g. a fluorescent label).

The template sequence may be any part of a genome sequence (e.g. SEQ ID 10967). For example, it
could be a rRNA gene (e.g. Turenne et al. (2000) J. Clin. Microbiol. 38:513-520; SEQ IDs 12018-12024
herein) or a protein-coding gene. The template sequence is preferably specific to GBS.
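
The way in which a primer pair defines the termini of the template sequence can be sketched in Python as a simple in silico amplification. The function names and the sequences in the usage note are hypothetical. As simplifying assumptions, the sketch requires exact rather than "substantial" complementarity, searches only the strand supplied, and ignores any non-complementary 5' tails.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complementary strand written 5' to 3' (uppercase A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_amplicon(strand, same_sense_primer, antisense_primer):
    """Return the region of strand delimited by the two primers, or None.

    same_sense_primer is complementary to the complement of the template, so
    it reads the same as the 5' end of the amplified region.
    antisense_primer is complementary to the template itself, so its reverse
    complement marks the 3' end of the amplified region.
    """
    start = strand.find(same_sense_primer)
    if start == -1:
        return None
    site = reverse_complement(antisense_primer)
    end = strand.find(site, start + len(same_sense_primer))
    if end == -1:
        return None
    return strand[start:end + len(site)]
```

For instance, with the made-up strand "TTTTATGCCAGGGGGGCTGAACTTTT" and the primers "ATGCCA" and "GTTCAG", the function returns "ATGCCAGGGGGGCTGAAC": the two primer binding sites form the termini of the amplified region.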

The invention also provides a computer-readable medium (e.g. a floppy disk, a hard disk, a CD-ROM, a
DVD etc.) and/or a computer database containing one or more of the sequences in the sequence listing.
The medium preferably contains SEQ ID 10967.

The invention also provides a hybrid protein represented by the formula NH2-A-[-X-L-]n-B-COOH,
wherein X is a protein of the invention, L is an optional linker amino acid sequence, A is an optional
N-terminal amino acid sequence, B is an optional C-terminal amino acid sequence, and n is an integer
greater than 1. The value of n is between 2 and x, and the value of x is typically 3, 4, 5, 6, 7, 8, 9 or 10.
Preferably n is 2, 3 or 4; it is more preferably 2 or 3; most preferably, n = 2. For each n instances, -X-
may be the same or different. For each n instances of [-X-L-], linker amino acid sequence -L- may be
present or absent. For instance, when n=2 the hybrid may be NH2-X1-L1-X2-L2-COOH,
NH2-X1-X2-COOH, NH2-X1-L1-X2-COOH, NH2-X1-X2-L2-COOH, etc. Linker amino acid sequence(s) -L- will
typically be short (e.g. 20 or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4,
3, 2, 1). Examples include short peptide sequences which facilitate cloning, poly-glycine linkers (i.e. Gly(n)
where n = 2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. His(n) where n = 3, 4, 5, 6, 7, 8, 9, 10
or more). Other suitable linker amino acid sequences will be apparent to those skilled in the art. -A- and
-B- are optional sequences which will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36,
35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8,
7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein trafficking, or short peptide
sequences which facilitate cloning or purification (e.g. histidine tags i.e. His(n) where n = 3, 4, 5, 6, 7, 8, 9,
10 or more). Other suitable N-terminal and C-terminal amino acid sequences will be apparent to those
skilled in the art. In some embodiments, each X will be a GBS sequence; in others, mixtures of GAS and
GBS will be used.
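
The formula NH2-A-[-X-L-]n-B-COOH can be made concrete with a short Python sketch that assembles the amino acid sequence of a hybrid from its parts. The function name, argument names and the sequences in the usage note are hypothetical; absent optional elements (-A-, -B- or an individual -L-) are represented by empty strings.

```python
def build_hybrid(proteins, linkers=None, n_term="", c_term=""):
    """Assemble NH2-A-[-X-L-]n-B-COOH as a single amino acid string.

    proteins: the n sequences X1..Xn (n must be greater than 1).
    linkers:  one entry per protein; "" means that -L- is absent.
    n_term, c_term: the optional -A- and -B- sequences.
    """
    if len(proteins) < 2:
        raise ValueError("a hybrid needs at least two -X- moieties (n > 1)")
    if linkers is None:
        linkers = [""] * len(proteins)
    if len(linkers) != len(proteins):
        raise ValueError("supply one linker (possibly empty) per protein")
    body = "".join(x + l for x, l in zip(proteins, linkers))
    return n_term + body + c_term
```

For example, build_hybrid(["MKT", "AVG"], ["GGGG", ""], c_term="HHHHHH") corresponds to NH2-X1-L1-X2-B-COOH with a Gly(4) linker and a His(6) tag as -B-, and returns "MKTGGGGAVGHHHHHH". The length guidance for linkers and terminal sequences is described in the text as typical rather than mandatory, so the sketch does not enforce it.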

According to further aspects, the invention provides various processes. 

A process for producing proteins of the invention is provided, comprising the step of culturing a host
cell according to the invention under conditions which induce protein expression.

A process for producing protein or nucleic acid of the invention is provided, wherein the protein or
nucleic acid is synthesised in part or in whole using chemical means.

A process for detecting polynucleotides of the invention is provided, comprising the steps of: (a)
contacting a nucleic acid probe according to the invention with a biological sample under hybridising
conditions to form duplexes; and (b) detecting said duplexes.

A process for detecting Streptococcus in a biological sample (e.g. blood) is also provided, comprising
the step of contacting nucleic acid according to the invention with the biological sample under
hybridising conditions. The process may involve nucleic acid amplification (e.g. PCR, SDA, SSSR,
LCR, TMA, NASBA etc.) or hybridisation (e.g. microarrays, blots, hybridisation with a probe in
solution etc.). PCR detection of Streptococcus in clinical samples, in particular S.pyogenes, has been
reported [see e.g. Louie et al. (2000) CMAJ 163:301-309; Louie et al. (1998) J. Clin. Microbiol.
36:1769-1771]. Clinical assays based on nucleic acid are described in general in Tang et al. (1997) Clin.
Chem. 43:2021-2038.

A process for detecting proteins of the invention is provided, comprising the steps of: (a) contacting an
antibody of the invention with a biological sample under conditions suitable for the formation of
antibody-antigen complexes; and (b) detecting said complexes.

A process for identifying an amino acid sequence is provided, comprising the step of searching for
putative open reading frames or protein-coding regions within a genome sequence of S.agalactiae. This
will typically involve in silico searching the sequence for an initiation codon and for an in-frame
termination codon in the downstream sequence. The region between these initiation and termination
codons is a putative protein-coding sequence. Typically, all six possible reading frames will be searched.
Suitable software for such analysis includes ORFFINDER (NCBI), GENEMARK [Borodovsky &
McIninch (1993) Computers Chem. 17:122-133], GLIMMER [Salzberg et al. (1998) Nucleic Acids Res.
26:544-548; Salzberg et al. (1999) Genomics 59:24-31; Delcher et al. (1999) Nucleic Acids Res.
27:4636-4641], or other software which uses Markov models [e.g. Shmatkov et al. (1999) Bioinformatics
15:874-876]. The invention also provides a protein comprising the identified amino acid sequence. These
proteins can then be expressed using conventional techniques.
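
The search described above (an initiation codon followed by an in-frame termination codon, in all six reading frames) can be sketched in Python as follows. This is a simplified illustration and not a substitute for the software named in the text. As assumptions, only ATG is treated as an initiation codon (bacteria can also use alternative start codons), the three standard stop codons are used, the input is uppercase A/C/G/T, and coordinates on the "-" strand refer to the reverse-complemented sequence. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=1):
    """Putative coding regions in all six frames.

    Returns a list of (strand, frame, start, end, orf), where orf runs from
    the initiation codon through the in-frame termination codon and
    min_codons counts the codons before the stop.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs
```

In practice a minimum length (for example min_codons=100) would usually be applied to suppress short spurious hits. The gene finders cited above additionally use statistical models of coding sequence, which this sketch does not attempt.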

The invention also provides a process for determining whether a test compound binds to a protein of the
invention. If a test compound binds to a protein of the invention and this binding inhibits the life cycle of
the GBS bacterium, then the test compound can be used as an antibiotic or as a lead compound for the
design of antibiotics. The process will typically comprise the steps of contacting a test compound with a
protein of the invention, and determining whether the test compound binds to said protein. Preferred
proteins of the invention for use in these processes are enzymes (e.g. tRNA synthetases), membrane
transporters and ribosomal proteins. Suitable test compounds include proteins, polypeptides,
carbohydrates, lipids, nucleic acids (e.g. DNA, RNA, and modified forms thereof), as well as small
organic compounds (e.g. MW between 200 and 2000 Da). The test compounds may be provided
individually, but will typically be part of a library (e.g. a combinatorial library). Methods for detecting a
binding interaction include NMR, filter-binding assays, gel-retardation assays, displacement assays,
surface plasmon resonance, reverse two-hybrid etc. A compound which binds to a protein of the
invention can be tested for antibiotic activity by contacting the compound with GBS bacteria and then
monitoring for inhibition of growth. The invention also provides a compound identified using these
methods.

The invention also provides a composition comprising a protein of the invention and one or more of the
following antigens:

- a protein antigen from Helicobacter pylori such as VacA, CagA, NAP, HopX, HopY [e.g.
WO98/04702] and/or urease.

- a protein antigen from N.meningitidis serogroup B, such as those in WO99/24578, WO99/36544,
WO99/57280, WO00/22430, Tettelin et al. (2000) Science 287:1809-1815, Pizza et al. (2000)
Science 287:1816-1820 and WO96/29412, with protein '287' and derivatives being particularly
preferred.

- an outer-membrane vesicle (OMV) preparation from N.meningitidis serogroup B, such as those
disclosed in WO01/52885; Bjune et al. (1991) Lancet 338(8775):1093-1096; Fukasawa et al. (1999)
Vaccine 17:2951-2958; Rosenqvist et al. (1998) Dev. Biol. Stand. 92:323-333 etc.

- a saccharide antigen from N.meningitidis serogroup A, C, W135 and/or Y, such as the
oligosaccharide disclosed in Costantino et al. (1992) Vaccine 10:691-698 from serogroup C [see
also Costantino et al. (1999) Vaccine 17:1251-1263].

- a saccharide antigen from Streptococcus pneumoniae [e.g. Watson (2000) Pediatr Infect Dis J
19:331-332; Rubin (2000) Pediatr Clin North Am 47:269-285, v; Jedrzejas (2001) Microbiol Mol
Biol Rev 65:187-207].

- an antigen from hepatitis A virus, such as inactivated virus [e.g. Bell (2000) Pediatr Infect Dis J
19:1187-1188; Iwarson (1995) APMIS 103:321-326].

- an antigen from hepatitis B virus, such as the surface and/or core antigens [e.g. Gerlich et al. (1990)
Vaccine 8 Suppl:S63-68 & 79-80].

- an antigen from hepatitis C virus [e.g. Hsu et al. (1999) Clin Liver Dis 3:901-915].

- an antigen from Bordetella pertussis, such as pertussis holotoxin (PT) and filamentous
haemagglutinin (FHA) from B.pertussis, optionally also in combination with pertactin and/or
agglutinogens 2 and 3 [e.g. Gustafsson et al. (1996) N. Engl. J. Med. 334:349-355; Rappuoli et al.
(1991) TIBTECH 9:232-238].

- a diphtheria antigen, such as a diphtheria toxoid [e.g. chapter 3 of Vaccines (1988) eds. Plotkin &
Mortimer. ISBN 0-7216-1946-0] e.g. the CRM197 mutant [e.g. Del Guidice et al. (1998) Molecular
Aspects of Medicine 19:1-70].

- a tetanus antigen, such as a tetanus toxoid [e.g. chapter 4 of Plotkin & Mortimer].

- a saccharide antigen from Haemophilus influenzae B.

- an antigen from N.gonorrhoeae [e.g. WO99/24578, WO99/36544, WO99/57280].

- an antigen from Chlamydia pneumoniae [e.g. PCT/IB01/01445; Kalman et al. (1999) Nature
Genetics 21:385-389; Read et al. (2000) Nucleic Acids Res 28:1397-406; Shirai et al. (2000) J.
Infect. Dis. 181(Suppl 3):S524-S527; WO99/27105; WO00/27994; WO00/37494].

- an antigen from Chlamydia trachomatis [e.g. WO99/28475].

- an antigen from Porphyromonas gingivalis [e.g. Ross et al. (2001) Vaccine 19:4135-4142].

- polio antigen(s) [e.g. Sutter et al. (2000) Pediatr Clin North Am 47:287-308; Zimmerman & Spann
(1999) Am Fam Physician 59:113-118, 125-126] such as IPV or OPV.

- rabies antigen(s) [e.g. Dreesen (1997) Vaccine 15 Suppl:S2-6] such as lyophilised inactivated virus
[e.g. MMWR Morb Mortal Wkly Rep 1998 Jan 16;47(1):12, 19; RabAvert™].

- measles, mumps and/or rubella antigens [e.g. chapters 9, 10 & 11 of Plotkin & Mortimer].

- influenza antigen(s) [e.g. chapter 19 of Plotkin & Mortimer], such as the haemagglutinin and/or
neuraminidase surface proteins.

- an antigen from Moraxella catarrhalis [e.g. McMichael (2000) Vaccine 19 Suppl 1:S101-107].

- an antigen from Staphylococcus aureus [e.g. Kuroda et al. (2001) Lancet 357(9264):1225-1240;
see also pages 1218-1219].

Where a saccharide or carbohydrate antigen is included, it is preferably conjugated to a carrier protein in
order to enhance immunogenicity [e.g. Ramsay et al. (2001) Lancet 357(9251):195-196; Lindberg (1999)
Vaccine 17 Suppl 2:S28-36; Conjugate Vaccines (eds. Cruse et al.) ISBN 3805549326, particularly vol.
10:48-114 etc.]. Preferred carrier proteins are bacterial toxins or toxoids, such as diphtheria or tetanus
toxoids. The CRM197 diphtheria toxoid is particularly preferred. Other suitable carrier proteins include
the N.meningitidis outer membrane protein [e.g. EP-0372501], synthetic peptides [e.g. EP-0378881,
EP-0427347], heat shock proteins [e.g. WO93/17712], pertussis proteins [e.g. WO98/58668; EP-0471177],
protein D from H.influenzae [e.g. WO00/56360], toxin A or B from C.difficile [e.g. WO00/61761], etc.
Any suitable conjugation reaction can be used, with any suitable linker where necessary.

Toxic protein antigens may be detoxified where necessary (e.g. detoxification of pertussis toxin by
chemical and/or genetic means).

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus antigen
and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to include
diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is preferred also to
include diphtheria and tetanus antigens.

Antigens are preferably adsorbed to an aluminium salt.

Antigens in the composition will typically be present at a concentration of at least 1µg/ml each. In
general, the concentration of any given antigen will be sufficient to elicit an immune response against that
antigen.

The invention also provides compositions comprising two or more proteins of the present invention.
The two or more proteins may comprise GBS sequences or may comprise GAS and GBS sequences.

A summary of standard techniques and procedures which may be employed to perform the invention
(e.g. to utilise the disclosed sequences for vaccination or diagnostic purposes) follows. This summary is
not a limitation on the invention but, rather, gives examples that may be used, but are not required.

General 

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the
art. Such techniques are explained fully in the literature e.g. Sambrook Molecular Cloning; A Laboratory
Manual, Second Edition (1989); DNA Cloning, Volumes I and II (DH Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J.
Higgins eds. 1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal
Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A
Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic Press,
Inc.), especially volumes 154 & 155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P.
Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. (1987), Immunochemical
Methods in Cell and Molecular Biology (Academic Press, London); Scopes, (1987) Protein
Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.), and Handbook of
Experimental Immunology, Volumes I-IV (D.M. Weir and C.C. Blackwell eds. 1986).

Standard abbreviations for nucleotides and amino acids are used in this specification. 

Definitions 

A composition containing X is "substantially free of" Y when at least [...] by weight of the total X+Y in
the composition is X. Preferably, X comprises at least about 90% by weight of the total of X+Y in the
composition, more preferably at least about 95% or even 99% by weight.

The term "comprising" means "including" as well as "consisting" e.g. a composition "comprising" X may
consist exclusively of X or may include something additional [...].

The term "heterologous" refers to two biological components that are not found together in nature. The
components may be host cells, genes, or regulatory regions, such as promoters. Although the
heterologous components are not found together in nature, they can function together, as when a
promoter heterologous to a gene is operably linked to the gene. Another example is where a
streptococcus sequence is heterologous to a mouse host cell. A further example would be two epitopes
from the same or different proteins which have been assembled in a single protein in an arrangement
not found in nature.

An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of
polynucleotides, such as an expression vector. The origin of replication behaves as an autonomous unit
of polynucleotide replication within a cell, capable of replication under its own control. An origin of
replication may be needed for a vector to replicate in a particular host cell. With certain origins of
replication, an expression vector can be reproduced at a high copy number in the presence of the
appropriate proteins within the cell. Examples [...] T-antigen, effective in COS-7 cells.

A "mutant" sequence is defined as DNA, RNA or amino acid sequence differing from but having
sequence identity with the native or disclosed sequence. Depending on the particular sequence, the
degree of sequence identity between the native or disclosed sequence and the mutant sequence is
preferably greater than 50% (e.g. 60%, 70%, 80%, 90%, 95%, 99% or more, calculated using the
Smith-Waterman algorithm as described above). As used herein, an "allelic variant" of a nucleic acid
molecule, or region, for which nucleic acid sequence is provided herein is a nucleic acid molecule, or
region, that occurs essentially at the same locus in the genome of another or second isolate, and that,
due to natural variation caused by, for example, mutation or recombination, has a similar but not
identical nucleic acid sequence. A coding region allelic variant typically encodes a protein having similar
activity to that of the protein encoded by the gene to which it is being compared. An allelic variant can
also comprise an alteration in the 5' or 3' untranslated regions of the gene [...] (e.g. see US patent
5,753,235).

Expression systems 

The streptococcus nucleotide sequences can be expressed in a variety of different expression systems; for
example those used with mammalian cells, baculoviruses, plants, [...].

i. Mammalian Systems

Mammalian expression systems are known in the art. A mammalian promoter is any DNA sequence
capable of binding mammalian RNA polymerase and initiating the downstream (3') transcription of a
coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiating region,
which is usually placed proximal to the 5' end of the coding sequence, and a TATA box, usually located
25-30 base pairs (bp) upstream of the transcription initiation site. The TATA box is thought to direct
RNA polymerase II to begin RNA synthesis at the correct site. A mammalian promoter will also contain
an upstream promoter element, usually located within 100 to 200 bp upstream of the TATA box. An
upstream promoter element determines the rate at which transcription is initiated and can act in either
orientation [Sambrook et al. (1989) "Expression of Cloned Genes in Mammalian Cells." In Molecular
Cloning: A Laboratory Manual, 2nd ed.].

Mammalian viral genes are often highly expressed and have a broad host range; therefore sequences
encoding mammalian viral genes provide particularly useful promoter sequences. Examples include the
SV40 early promoter, mouse mammary tumor virus LTR promoter, adenovirus major late promoter (Ad
MLP), and herpes simplex virus promoter. In addition, sequences derived from non-viral genes, such as
the murine metallothionein gene, also provide useful promoter sequences. Expression may be either
constitutive or regulated (inducible); depending on the promoter, it can be induced with glucocorticoid in
hormone-responsive cells.

The presence of an enhancer element (enhancer), combined with the promoter elements described above, will usually increase expression levels. An enhancer is a regulatory DNA sequence that can stimulate transcription up to 1000-fold when linked to homologous or heterologous promoters, with synthesis beginning at the normal RNA start site. Enhancers are also active when they are placed upstream or downstream from the transcription initiation site, in either normal or flipped orientation, or at a distance of more than 1000 nucleotides from the promoter [Maniatis et al. (1987) Science 236:1237; Alberts et al. (1989) Molecular Biology of the Cell, 2nd ed.]. Enhancer elements derived from viruses may be particularly useful, because they usually have a broader host range. Examples include the SV40 early gene enhancer [Dijkema et al. (1985) EMBO J. 4:761] and the enhancer/promoters derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus [Gorman et al. (1982b) Proc. Natl. Acad. Sci. 79:6777] and from human cytomegalovirus [Boshart et al. (1985) Cell 41:521]. Additionally, some enhancers are regulatable and become active only in the presence of an inducer, such as a hormone or metal ion [Sassone-Corsi and Borelli (1986) Trends Genet. 2:215; Maniatis et al. (1987) Science 236:1237].

A DNA molecule may be expressed intracellularly in mammalian cells. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If desired, the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide.
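As a rough illustration of what cyanogen bromide treatment does to a sequence, the Python sketch below splits a one-letter protein sequence after every methionine. Cyanogen bromide cleaves on the C-terminal side of methionine residues in general, so the same treatment that removes the N-terminal methionine would also cut at any internal methionine. The function name `cnbr_fragments` and the example sequences are hypothetical, and the sketch ignores incomplete cleavage and side reactions.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine (M),
    mimicking idealized, complete cyanogen bromide cleavage."""
    fragments = []
    current = ""
    for residue in protein.upper():
        current += residue
        if residue == "M":
            # Cleavage occurs on the C-terminal side of methionine.
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical sequences: one with only the initiator Met, one with an internal Met.
only_initiator = cnbr_fragments("MKTAYIAK")
with_internal = cnbr_fragments("MKTMAY")
```

For a product with internal methionines, a check like this suggests the cleavage step would fragment the protein rather than simply trim its N-terminus.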

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion of the foreign protein in mammalian cells. Preferably, there are processing sites encoded between the leader fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The adenovirus tripartite leader is an example of a leader sequence that provides for secretion of a foreign protein in mammalian cells.

Usually, transcription termination and polyadenylation sequences recognized by mammalian cells are regulatory regions located 3' to the translation stop codon and thus, together with the promoter elements, flank the coding sequence. The 3' terminus of the mature mRNA is formed by site-specific post-transcriptional cleavage and polyadenylation [Birnstiel et al. (1985) Cell 41:349; Proudfoot and Whitelaw (1988) "Termination and 3' end processing of eukaryotic RNA." In Transcription and splicing (ed. B.D. Hames and D.M. Glover); Proudfoot (1989) Trends Biochem. Sci. 14:105]. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Examples of transcription terminator/polyadenylation signals include those derived from SV40 [Sambrook et al. (1989) "Expression of cloned genes in cultured mammalian cells." In Molecular Cloning: A Laboratory Manual].

Usually, the above described components, comprising a promoter, polyadenylation signal, and transcription termination sequence, are put together into expression constructs. Enhancers, introns with functional splice donor and acceptor sites, and leader sequences may also be included in an expression construct, if desired. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as mammalian cells or bacteria. Mammalian replication systems include those derived from animal viruses, which require trans-acting factors to replicate. For example, plasmids containing the replication systems of papovaviruses, such as SV40 [Gluzman (1981) Cell 23:175] or polyomavirus, replicate to extremely high copy number in the presence of the appropriate viral T antigen. Additional examples of mammalian replicons include those derived from bovine papillomavirus and Epstein-Barr virus. Additionally, the replicon may have two replication systems, thus allowing it to be maintained, for example, in mammalian cells for expression and in a prokaryotic host for cloning and amplification. Examples of such mammalian-bacteria shuttle vectors include pMT2 [Kaufman et al. (1989) Mol. Cell. Biol. 9:946] and pHEBO [Shimizu et al. (1986) Mol. Cell. Biol. 6:1074].

The transformation procedure used depends upon the host to be transformed. Methods for introduction of heterologous polynucleotides into mammalian cells are known in the art and include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

Mammalian cell lines available as hosts for expression are known in the art and include many immortalized cell lines available from the American Type Culture Collection (ATCC), including but not limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), human hepatocellular carcinoma cells (e.g. Hep G2), and a number of other cell lines.

ii. Baculovirus Systems

The polynucleotide encoding the protein can also be inserted into a suitable insect expression vector, and is operably linked to the control elements within that vector. Vector construction employs techniques which are known in the art. Generally, the components of the expression system include a transfer vector, usually a bacterial plasmid, which contains both a fragment of the baculovirus genome, and a convenient restriction site for insertion of the heterologous gene or genes to be expressed; a wild type baculovirus with a sequence homologous to the baculovirus-specific fragment in the transfer vector (this allows for the homologous recombination of the heterologous gene into the baculovirus genome); and appropriate insect host cells and growth media.

After inserting the DNA sequence encoding the protein into the transfer vector, the vector and the wild type viral genome are transfected into an insect host cell where the vector and viral genome are allowed to recombine. The packaged recombinant virus is expressed and recombinant plaques are identified and purified. Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, inter alia, Invitrogen, San Diego CA ("MaxBac" kit). These techniques are generally known to those skilled in the art and fully described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987) (hereinafter "Summers and Smith").

Prior to inserting the DNA sequence encoding the protein into the baculovirus genome, the above described components, comprising a promoter, leader (if desired), coding sequence, and transcription termination sequence, are usually assembled into an intermediate transplacement construct (transfer vector). This may contain a single gene and operably linked regulatory elements; multiple genes, each with its own set of operably linked regulatory elements; or multiple genes, regulated by the same set of regulatory elements. Intermediate transplacement constructs are often maintained in a replicon, such as an extra-chromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as a bacterium. The replicon will have a replication system, thus allowing it to be maintained in a suitable host for cloning and amplification.

Currently, the most commonly used transfer vector for introducing foreign genes into AcNPV is pAc373. Many other vectors, known to those of skill in the art, have also been designed. These include, for example, pVL985 (which alters the polyhedrin start codon from ATG to ATT; see Luckow and Summers, Virology (1989) 17:31).

The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al. (1988) Ann. Rev. Microbiol., 42:177) and a prokaryotic ampicillin-resistance (amp) gene and origin of replication for selection and propagation in E. coli.

Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus promoter is any DNA sequence capable of binding a baculovirus RNA polymerase and initiating the downstream (5' to 3') transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site and a transcription initiation site. A baculovirus transfer vector may also have a second domain called an enhancer, which, if present, is usually distal to the structural gene. Expression may be either regulated or constitutive.

Structural genes, abundantly transcribed at late times in a viral infection cycle, provide particularly useful promoter sequences. Examples include sequences derived from the gene encoding the viral polyhedron protein, Friesen et al., (1986) "The Regulation of Baculovirus Gene Expression," in: The Molecular Biology of Baculoviruses (ed. Walter Doerfler); EPO Publ. Nos. 127 839 and 155 476; and the gene encoding the p10 protein, Vlak et al., (1988), J. Gen. Virol. 69:765.

DNA encoding suitable signal sequences can be derived from genes for secreted insect or baculovirus proteins, such as the baculovirus polyhedrin gene (Carbonell et al. (1988) Gene, 73:409). Alternatively, since the signals for mammalian cell posttranslational modifications (such as signal peptide cleavage, proteolytic cleavage, and phosphorylation) appear to be recognized by insect cells, and the signals required for secretion and nuclear accumulation also appear to be conserved between the invertebrate cells and vertebrate cells, leaders of non-insect origin, such as those derived from genes encoding human α-interferon, Maeda et al., (1985), Nature 315:592; human gastrin-releasing peptide, Lebacq-Verheyden et al., (1988), Molec. Cell. Biol. 8:3129; human IL-2, Smith et al., (1985) Proc. Nat'l Acad. Sci. USA, 82:8404; mouse IL-3, Miyajima et al., (1987) Gene 58:273; and human glucocerebrosidase, Martin et al. (1988) DNA, 7:99, can also be used to provide for secretion in insects.

A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is expressed with the proper regulatory sequences, it can be secreted. Good intracellular expression of nonfused foreign proteins usually requires heterologous genes that ideally have a short leader sequence containing suitable translation initiation signals preceding an ATG start signal. If desired, methionine at the N-terminus may be cleaved from the mature protein by in vitro incubation with cyanogen bromide.

Alternatively, recombinant polyproteins or proteins which are not naturally secreted can be secreted from the insect cell by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion of the foreign protein in insects. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the translocation of the protein into the endoplasmic reticulum.

After insertion of the DNA sequence and/or the gene encoding the expression product precursor of the protein, an insect cell host is co-transformed with the heterologous DNA of the transfer vector and the genomic DNA of wild type baculovirus, usually by co-transfection. The promoter and transcription termination sequence of the construct will usually comprise a 2-5kb section of the baculovirus genome. Methods for introducing heterologous DNA into the desired site in the baculovirus are known in the art. (See Summers and Smith supra; Ju et al. (1987); Smith et al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and Summers (1989)). For example, the insertion can be into a gene such as the polyhedrin gene, by homologous double crossover recombination; insertion can also be into a restriction enzyme site engineered into the desired baculovirus gene. Miller et al., (1989), Bioessays 4:91. The DNA sequence, when cloned in place of the polyhedrin gene in the expression vector, is flanked both 5' and 3' by polyhedrin-specific sequences and is positioned downstream of the polyhedrin promoter.

The newly formed baculovirus expression vector is subsequently packaged into an infectious recombinant baculovirus. Homologous recombination occurs at low frequency (between about 1% and about 5%); thus, the majority of the virus produced after cotransfection is still wild-type virus. Therefore, a method is necessary to identify recombinant viruses. An advantage of the expression system is a visual screen allowing recombinant viruses to be distinguished. The polyhedrin protein, which is produced by the native virus, is produced at very high levels in the nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms occlusion bodies that also contain embedded particles. These occlusion bodies, up to 15 µm in size, are highly refractile, giving them a bright shiny appearance that is readily visualized under the light microscope. Cells infected with recombinant viruses lack occlusion bodies. To distinguish recombinant virus from wild-type virus, the transfection supernatant is plaqued onto a monolayer of insect cells by techniques known to those skilled in the art. Namely, the plaques are screened under the light microscope for the presence (indicative of wild-type virus) or absence (indicative of recombinant virus) of occlusion bodies. "Current Protocols in Microbiology" Vol. 2 (Ausubel et al. eds) at 16.8 (Supp. 10, 1990); Summers and Smith, supra; Miller et al. (1989).
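Because recombinants make up only about 1% to 5% of the virus produced, the screening effort can be estimated in advance. The Python sketch below computes how many plaques would need to be examined to see at least one occlusion-negative plaque with a chosen probability. It assumes, purely for illustration, that plaques are independent and that the fraction of recombinant plaques equals the recombination frequency; the function name `plaques_to_screen` and the 95% default are hypothetical choices, not values from this document.

```python
import math


def plaques_to_screen(recombinant_fraction, confidence=0.95):
    """Smallest number of plaques n such that the chance of seeing at least one
    recombinant plaque, 1 - (1 - f)**n, is at least the requested confidence."""
    if not 0 < recombinant_fraction < 1:
        raise ValueError("recombinant_fraction must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - recombinant_fraction))


# The two ends of the frequency range quoted in the text.
low_end = plaques_to_screen(0.01)
high_end = plaques_to_screen(0.05)
```

Under these assumptions the estimate falls from roughly three hundred plaques at a 1% frequency to roughly sixty at 5%, which is why a quick visual screen for occlusion bodies is useful.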

Recombinant baculovirus expression vectors have been developed for infection into several insect cells. For example, recombinant baculoviruses have been developed for, inter alia: Aedes aegypti, Autographa californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and Trichoplusia ni (WO 89/046699; Carbonell et al., (1985) J. Virol. 56:153; Wright (1986) Nature 321:718; Smith et al., (1983) Mol. Cell. Biol. 3:2156; and see generally, Fraser, et al. (1989) In Vitro Cell. Dev. Biol. 25:225).

Cells and cell culture media are commercially available for both direct and fusion expression of heterologous polypeptides in a baculovirus/expression system; cell culture technology is generally known to those skilled in the art. See, e.g. Summers and Smith supra.

The modified insect cells may then be grown in an appropriate nutrient medium, which allows for stable maintenance of the plasmid(s) present in the modified insect host. Where the expression product gene is under inducible control, the host may be grown to high density, and expression induced. Alternatively, where expression is constitutive, the product will be continuously expressed into the medium and the nutrient medium must be continuously circulated, while removing the product of interest and augmenting depleted nutrients. The product may be purified by such techniques as chromatography, e.g. HPLC, affinity chromatography, ion exchange chromatography, etc.; electrophoresis; density gradient centrifugation; solvent extraction, etc. As appropriate, the product may be further purified, as required, so as to remove substantially any insect proteins which are also present in the medium, so as to provide a product which is at least substantially free of host debris, e.g. proteins, lipids and polysaccharides.

In order to obtain protein expression, recombinant host cells derived from the transformants are incubated under conditions which allow expression of the recombinant protein encoding sequence. These conditions will vary, dependent upon the host cell selected. However, the conditions are readily ascertainable to those of ordinary skill in the art, based upon what is known in the art.

iii. Plant Systems

There are many plant cell culture and whole plant genetic expression systems known in the art. Exemplary plant cellular genetic expression systems include those described in patents, such as: US 5,693,506; US 5,659,122; and US 5,608,143. Additional examples of genetic expression in plant cell culture have been described by Zenk, Phytochemistry 30:3861-3863 (1991).

Descriptions of plant protein signal peptides may be found, in addition to the references described above, in Vaulcombe et al., Mol. Gen. Genet. 209:33-40 (1987); Chandler et al., Plant Molecular Biology 3:407-418 (1984); Rogers, J. Biol. Chem. 260:3731-3738 (1985); Rothstein et al., Gene 55:353-356 (1987); Whittier et al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the regulation of plant gene expression by the phytohormone gibberellic acid, and of secreted enzymes induced by gibberellic acid, can be found in R.L. Jones and J. MacMillin, Gibberellins: in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984 Pitman Publishing Limited, London, pp. 21-52. References that describe other metabolically-regulated genes: Sheen, Plant Cell, 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-3452 (1990); Benkel and Hickey, Proc. Natl. Acad. Sci. 84:1337-1339 (1987).

Typically, using techniques known in the art, a desired polynucleotide sequence is inserted into an expression cassette comprising genetic regulatory elements designed for operation in plants. The expression cassette is inserted into a desired expression vector with companion sequences upstream and downstream from the expression cassette suitable for expression in a plant host. The companion sequences will be of plasmid or viral origin and provide necessary characteristics to the vector to permit the vectors to move DNA from an original cloning host, such as bacteria, to the desired plant host. The basic bacterial/plant vector construct will preferably provide a broad host range prokaryote replication origin; a prokaryote selectable marker; and, for Agrobacterium transformations, T DNA sequences for Agrobacterium-mediated transfer to plant chromosomes. Where the heterologous gene is not readily amenable to detection, the construct will preferably also have a selectable marker gene suitable for determining if a plant cell has been transformed. A general review of suitable markers is found in Wilmink and Dons, 1993, Plant Mol. Biol. Reptr, 11(2):165-185.

Sequences suitable for permitting integration of the heterologous sequence into the plant genome are also recommended. These might include transposon sequences and the like for homologous recombination as well as Ti sequences which permit random insertion of a heterologous expression cassette into a plant genome. Suitable prokaryote selectable markers include resistance toward antibiotics such as ampicillin or tetracycline. Other DNA sequences encoding additional functions may also be present in the vector, as is known in the art.

The nucleic acid molecules of the subject invention may be included into an expression cassette for expression of the protein(s) of interest. Usually, there will be only one expression cassette, although two or more are feasible. The recombinant expression cassette will contain, in addition to the heterologous protein encoding sequence, the following elements: a promoter region, plant 5' untranslated sequences, initiation codon depending upon whether or not the structural gene comes equipped with one, and a transcription and translation termination sequence. Unique restriction enzyme sites at the 5' and 3' ends of the cassette allow for easy insertion into a pre-existing vector.

A heterologous coding sequence may be for any protein relating to the present invention. The sequence encoding the protein of interest will encode a signal peptide which allows processing and translocation of the protein, as appropriate, and will usually lack any sequence which might result in the binding of the desired protein of the invention to a membrane. Since, for the most part, the transcriptional initiation region will be for a gene which is expressed and translocated during germination, by employing the signal peptide which provides for translocation, one may also provide for translocation of the protein of interest. In this way, the protein(s) of interest will be translocated from the cells in which they are expressed and may be efficiently harvested. Typically secretion in seeds is across the aleurone or scutellar epithelium layer into the endosperm of the seed. While it is not required that the protein be secreted from the cells in which the protein is produced, this facilitates the isolation and purification of the recombinant protein.

Since the ultimate expression of the desired gene product will be in a eucaryotic cell, it is desirable to determine whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's splicosome machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing a portion of the genetic message as a false intron code, Reed and Maniatis, Cell 41:95-105, 1985.

The vector can be microinjected directly into plant cells by use of micropipettes to mechanically transfer the recombinant DNA. Crossway, Mol. Gen. Genet, 202:179-185, 1985. The genetic material may also be transferred into the plant cell by using polyethylene glycol, Krens, et al., Nature, 296, 72-74, 1982. Another method of introduction of nucleic acid segments is high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface, Klein, et al., Nature, 327, 70-73, 1987 and Knudsen and Muller, 1991, Planta, 185:330-336 teaching particle bombardment of barley endosperm to create transgenic barley. Yet another method of introduction would be fusion of protoplasts with other entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies, Fraley, et al., Proc. Natl. Acad. Sci. USA, 79, 1859-1863, 1982.

The vector may also be introduced into the plant cells by electroporation (Fromm et al., Proc. Natl Acad. Sci. USA 82:5824, 1985). In this technique, plant protoplasts are electroporated in the presence of plasmids containing the gene construct. Electrical impulses of high field strength reversibly permeabilize biomembranes allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.

All plants from which protoplasts can be isolated and cultured to give whole regenerated plants can be transformed by the present invention so that whole plants are recovered which contain the transferred gene. It is known that practically all plants can be regenerated from cultured cells or tissues, including but not limited to all major species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables. Some suitable plants include, for example, species from the genera Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersion, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hererocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browaalia, Glycine, Lolium, Zea, Triticum, Sorghum, and Datura.

Means for regeneration vary from species to species of plants, but generally a suspension of transformed protoplasts containing copies of the heterologous gene is first provided. Callus tissue is formed and shoots may be induced from callus and subsequently rooted. Alternatively, embryo formation can be induced from the protoplast suspension. These embryos germinate as natural embryos to form plants. The culture media will generally contain various amino acids and hormones, such as auxin and cytokinins. It is also advantageous to add glutamic acid and proline to the medium. Shoots and roots normally develop simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on the history of the culture. If these three variables are controlled, then regeneration is fully reproducible and repeatable.

In some plant cell culture systems, the desired protein of the invention may be excreted or alternatively, the protein may be extracted from the whole plant. Where the desired protein of the invention is secreted into the medium, it may be collected. Alternatively, the embryos and embryoless-half seeds or other plant tissue may be mechanically disrupted to release any secreted protein between cells and tissues. The mixture may be suspended in a buffer solution to retrieve soluble proteins. Conventional protein isolation and purification methods will then be used to purify the recombinant protein. Parameters of time, temperature, pH, oxygen, and volumes will be adjusted through routine methods to optimize expression and recovery of heterologous protein.

iv. Bacterial Systems

Bacterial expression techniques are known in the art. A bacterial promoter is any DNA sequence capable of binding bacterial RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site and a transcription initiation site. A bacterial promoter may also have a second domain called an operator, that may overlap an adjacent RNA polymerase binding site at which RNA synthesis begins. The operator permits negative regulated (inducible) transcription, as a gene repressor protein may bind the operator and thereby inhibit transcription of a specific gene. Constitutive expression may occur in the absence of negative regulatory elements, such as the operator. In addition, positive regulation may be achieved by a gene activator protein binding sequence, which, if present, is usually proximal (5') to the RNA polymerase binding sequence. An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli (E. coli) [Raibaud et al. (1984) Annu. Rev. Genet. 18:173]. Regulated expression may therefore be either positive or negative, thereby either enhancing or reducing transcription.

Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose (lac) [Chang et al. (1977) Nature 198:1056], and maltose. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) [Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 9:731; US patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla) promoter system [Weissmann (1981) "The cloning of interferon and other mistakes." In Interferon 3 (ed. I. Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 292:128] and T5 [US patent 4,689,406] promoter systems also provide useful promoter sequences.

In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. For example, transcription activation sequences of one bacterial or bacteriophage promoter may be joined with the operon sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter [US patent 4,551,433]. For example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter and lac operon sequences that is regulated by the lac repressor [Amann et al. (1983) Gene 25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21]. Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an example of a coupled promoter system [Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074]. In addition, a hybrid promoter can also be comprised of a bacteriophage promoter and an E. coli operator region (EPO-A-0 267 851).

In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon [Shine et al. (1975) Nature 254:34]. The SD sequence is thought to promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3' end of E. coli 16S rRNA [Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA." In Biological Regulation and Development: Gene Expression (ed. R.F. Goldberger)]. For the expression of eukaryotic genes and prokaryotic genes with a weak ribosome-binding site, see [Sambrook et al. (1989) "Expression of cloned genes in Escherichia coli." In Molecular Cloning: A Laboratory Manual].
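The geometry described above (a short purine-rich sequence 3-9 nucleotides long, sitting 3-11 nucleotides upstream of the ATG) lends itself to a simple scan. The Python sketch below looks, for each ATG in a DNA sequence, for the longest piece of the commonly cited consensus AGGAGG that ends 3-11 nucleotides before the start codon. The consensus string, the function name `find_sd_sites`, and the reading of "3-11 nucleotides upstream" as the gap between the end of the SD match and the ATG are assumptions made for illustration; only matches up to the six-base consensus are considered.

```python
SD_CONSENSUS = "AGGAGG"  # commonly cited core consensus; an assumption, not from this text


def find_sd_sites(seq, min_len=3, min_gap=3, max_gap=11):
    """For each ATG, report the longest exact piece of SD_CONSENSUS (at least
    min_len long) that ends min_gap..max_gap nucleotides upstream of it."""
    seq = seq.upper().replace("U", "T")
    # All contiguous pieces of the consensus that are long enough.
    cores = {
        SD_CONSENSUS[i:i + n]
        for n in range(min_len, len(SD_CONSENSUS) + 1)
        for i in range(len(SD_CONSENSUS) - n + 1)
    }
    results = []
    atg = seq.find("ATG")
    while atg != -1:
        best = None
        for gap in range(min_gap, max_gap + 1):
            for core in cores:
                sd_start = atg - gap - len(core)
                if sd_start >= 0 and seq[sd_start:sd_start + len(core)] == core:
                    if best is None or len(core) > len(best["sd"]):
                        best = {"atg": atg, "sd_start": sd_start, "sd": core, "gap": gap}
        if best is not None:
            results.append(best)
        atg = seq.find("ATG", atg + 1)
    return results


# Hypothetical construct: consensus SD, a 7-nucleotide spacer, then the start codon.
sites = find_sd_sites("CCCC" + "AGGAGG" + "TTTTTTT" + "ATG" + "GCC")
```

An ATG with no hit in such a scan is a candidate for the weak ribosome-binding-site situation mentioned above, although exact string matching is far cruder than the base pairing it stands in for.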

A DNA molecule may be expressed intracellularly. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine N-terminal peptidase (EP-A-0 219 237).

Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence encoding the N-terminal portion of an endogenous bacterial protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the bacteriophage lambda cII gene can be linked at the 5' terminus of a foreign gene and expressed in bacteria. The resulting fusion protein preferably retains a site for a processing enzyme (factor Xa) to cleave the bacteriophage protein from the foreign gene [Nagai et al. (1984) Nature 309:810]. Fusion proteins can also be made with sequences from the lacZ [Jia et al. (1987) Gene 60:197], trpE [Allen et al. (1987) J. Biotechnol. 5:93; Makoff et al. (1989) J. Gen. Microbiol. 135:11], and Chey [EP-A-0 324 647] genes. The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for a processing enzyme (e.g. ubiquitin specific processing-protease) to cleave the ubiquitin from the foreign protein. Through this method, native foreign protein can be isolated [Miller et al. (1989) Bio/Technology 7:698].

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA molecules that encode a fusion protein comprised of a signal peptide sequence fragment that provides for secretion of the foreign protein in bacteria [US patent 4,336,336]. The signal sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The protein is either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, which can be cleaved either in vivo or in vitro, encoded between the signal peptide fragment and the foreign gene.

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the E. coli outer membrane protein gene (ompA) [Masui et al. (1983), in: Experimental Manipulation of Gene Expression; Ghrayeb et al. (1984) EMBO J. 3:2437] and the E. coli alkaline phosphatase signal sequence (phoA) [Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212]. As an additional example, the signal sequence of the alpha-amylase gene from various Bacillus strains can be used to secrete heterologous proteins from B. subtilis [Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 244 042].

Usually, transcription termination sequences recognized by bacteria are regulatory regions located 3' to the translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Transcription termination sequences frequently include DNA sequences of about 50 nucleotides capable of forming stem loop structures that aid in terminating transcription. Examples include transcription termination sequences derived from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes.
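A stem loop forms where a stretch of sequence is followed, after a short spacer, by its own reverse complement. The Python sketch below searches a DNA sequence for such inverted repeats as a crude indicator of stem-loop potential. The stem length of 6, the loop range of 3-8 nucleotides, the requirement for perfect pairing, and the function names are all illustrative assumptions; they are not parameters given in this document, and real terminators tolerate mismatches that this sketch does not.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_stem_loops(seq, stem_len=6, min_loop=3, max_loop=8):
    """Return (start, stem_len, loop_len) for every perfect inverted repeat:
    a stem_len-mer followed, after min_loop..max_loop bases, by its reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        target = reverse_complement(seq[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop  # start of the candidate 3' arm
            if seq[j:j + stem_len] == target:
                hits.append((i, stem_len, loop))
    return hits


# Hypothetical hairpin: GCCGCC, a 4-base loop, then GGCGGC.
hairpins = find_stem_loops("CC" + "GCCGCC" + "TTTT" + "GGCGGC" + "CCCC")
```

Hits from a scan like this would only be a starting point; folding energy and sequence context also influence whether a hairpin actually aids termination.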

Usually, the above described components, comprising a promoter, signal sequence (if desired), coding sequence of interest, and transcription termination sequence, are put together into expression constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as bacteria. The replicon will have a replication system, thus allowing it to be maintained in a prokaryotic host either for expression or for cloning and amplification. In addition, a replicon may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy number plasmid will preferably contain at least about 10, and more preferably at least about 20 plasmids. Either a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign protein on the host.

Alternatively, the expression constructs can be integrated into the bacterial genome with an integrating vector. Integrating vectors usually contain at least one sequence homologous to the bacterial chromosome that allows the vector to integrate. Integrations appear to result from recombinations between homologous DNA in the vector and the bacterial chromosome. For example, integrating vectors constructed with DNA from various Bacillus strains integrate into the Bacillus chromosome (EP-A-0 127 328). Integrating vectors may also be comprised of bacteriophage or transposon sequences.

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for the selection of bacterial strains that have been transformed. Selectable markers can be expressed in the bacterial host and may include genes which render bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline [Davies et al. (1978) Annu. Rev. Microbiol. 32:469]. Selectable markers may also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine biosynthetic pathways.

Alternatively, some of the above described components can be put together in transformation vectors. Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector, as described above.

Expression and transformation vectors, either extra-chromosomal replicons or integrating vectors, have been developed for transformation into many bacteria. For example, expression vectors have been developed for, inter alia, the following bacteria: Bacillus subtilis [Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541], Escherichia coli [Shimatake et al. (1981) Nature 292:128; Amann et al. (1985) Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and EP-A-0 136 907], Streptococcus cremoris [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptococcus lividans [Powell et al. (1988) Appl. Environ. Microbiol. 54:655], Streptomyces lividans [US patent 4,745,056].

Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and usually include either the transformation of bacteria treated with CaCl2 or other agents, such as divalent cations and DMSO. DNA can also be introduced into bacterial cells by electroporation. Transformation procedures usually vary with the bacterial species to be transformed. See e.g. [Masson et al. (1989) FEMS Microbiol. Lett. 60:273; Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 036 259 and EP-A-0 063 953; WO 84/04541, Bacillus], [Miller et al. (1988) Proc. Natl. Acad. Sci. 85:956; Wang et al. (1990) J. Bacteriol. 172:949, Campylobacter], [Cohen et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res. 16:6127; Kushner (1978) "An improved method for transformation of Escherichia coli with ColE1-derived plasmids", in Genetic Engineering: Proceedings of the International Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia); Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta 949:318; Escherichia], [Chassy et al. (1987) FEMS Microbiol. Lett. 44:172, Lactobacillus]; [Fiedler et al. (1988) Anal. Biochem. 170:38, Pseudomonas]; [Augustin et al. (1990) FEMS Microbiol. Lett. 65:203, Staphylococcus], [Barany et al. (1980) J. Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by electroporation", in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infect. Immun. 32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th Eur. Cong. Biotechnology 1:412, Streptococcus].

v. Yeast Expression

Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is any DNA sequence capable of binding yeast RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site (the "TATA Box") and a transcription initiation site. A yeast promoter may also have a second domain called an upstream activator sequence (UAS), which, if present, is usually distal to the structural gene. The UAS permits regulated (inducible) expression. Constitutive expression occurs in the absence of a UAS. Regulated expression may be either positive or negative, thereby either enhancing or reducing transcription.

Yeast is a fermenting organism with an active metabolic pathway, therefore sequences encoding enzymes in the metabolic pathway provide particularly useful promoter sequences. Examples include alcohol dehydrogenase (ADH) (EP-A-0 284 044), enolase, glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK) (EP-A-0 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides useful promoter sequences [[...] et al. (1983) Proc. Natl. Acad. Sci. USA 80:1].

In addition, synthetic promoters which do not occur in nature also function as yeast promoters. For example, UAS sequences of one yeast promoter may be joined with the transcription activation region of another yeast promoter, creating a synthetic hybrid promoter. Examples of such hybrid promoters include the ADH regulatory sequence linked to the GAP transcription activation region (US Patent Nos. 4,876,197 and 4,880,734). Other examples of hybrid promoters include promoters which consist of the regulatory sequences of either the ADH2, GAL4, GAL10, or PHO5 genes, combined with the transcriptional activation region of a glycolytic enzyme gene such as GAP or PyK (EP-A-0 164 556). Furthermore, a yeast promoter can include naturally occurring promoters of non-yeast origin that have the ability to bind yeast RNA polymerase and initiate transcription. Examples of such promoters include, inter alia, [Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078; Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119; Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical, Environmental and Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon et al. (1980) Gene 11:162; Panthier et al. (1980) Curr. Genet. 2:109].

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide.
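
The relationship between the ATG start codon and the N-terminal methionine can be shown with a short translation sketch. The code below is an illustration only: it uses the standard genetic code, the function name is arbitrary, and the example sequence (a short upstream stretch followed by ATG, two codons and a stop codon) is hypothetical.

```python
"""Translate from the first ATG to show that the product starts with Met."""
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from_start(dna):
    """Translate from the first ATG until a stop codon or the end of the DNA."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical insert: upstream sequence, ATG, GCT, AAA, then TGA (stop).
    print(translate_from_start("GGATCCATGGCTAAATGA"))
```

Whatever coding sequence follows the ATG, the first residue returned is M, which is why a directly linked promoter always yields an N-terminal methionine unless it is subsequently removed.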

Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian, baculovirus, and bacterial expression systems. Usually, a DNA sequence encoding the N-terminal portion of an endogenous yeast protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the yeast or human superoxide dismutase (SOD) gene can be linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. See e.g. EP-A-0 196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for a processing enzyme (e.g. ubiquitin-specific processing protease) to cleave the ubiquitin from the foreign protein. Through this method, therefore, native foreign protein can be isolated (e.g. WO88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion in yeast of the foreign protein. Preferably, there are processing sites encoded between the leader fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, such as the yeast invertase gene (EP-A-0 012 873; JPO 62,096,086) and the A-factor gene (US patent 4,588,684). Alternatively, leaders of non-yeast origin, such as an interferon leader, exist that also provide for secretion in yeast (EP-A-0 060 057).

A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor gene, which contains both a "pre" signal sequence, and a "pro" region. The types of alpha-factor fragments that can be employed include the full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated alpha-factor leaders (usually about 25 to about 50 amino acid residues) (US Patents 4,546,083 and 4,870,008; EP-A-0 324 274). Additional leaders employing an alpha-factor leader fragment that provides for secretion include hybrid alpha-factor leaders made with a presequence of a first yeast, but a pro-region from a second yeast alpha-factor (e.g. see WO 89/02463).

Usually, transcription termination sequences recognized by yeast are regulatory regions located 3' to the translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Examples of transcription terminator sequences and other yeast-recognized termination sequences include those from genes coding for glycolytic enzymes.

Usually, the above described components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription termination sequence, are put together into expression constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as yeast or bacteria. The replicon may have two replication systems, thus allowing it to be maintained, for example, in yeast for expression and in a prokaryotic host for cloning and amplification. Examples of such yeast-bacteria shuttle vectors include YEp24 [Botstein et al. (1979) Gene 8:17-24], pCl/1 [Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646], and YRp17 [Stinchcomb et al. (1982) J. Mol. Biol. 158:157]. In addition, a replicon may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy number plasmid will preferably have at least about 10, and more preferably at least about 20. Either a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign protein on the host. See e.g. Brake et al., supra.

Alternatively, the expression constructs can be integrated into the yeast genome with an integrating vector. Integrating vectors usually contain at least one sequence homologous to a yeast chromosome that allows the vector to integrate, and preferably contain two homologous sequences flanking the expression construct. Integrations appear to result from recombinations between homologous DNA in the vector and the yeast chromosome [Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245]. An integrating vector may be directed to a specific locus in yeast by selecting the appropriate homologous sequence for inclusion in the vector. See Orr-Weaver et al., supra. One or more expression constructs may integrate, possibly affecting levels of recombinant protein produced [Rine et al. (1983) Proc. Natl. Acad. Sci. USA 80:6750]. The chromosomal sequences included in the vector can occur either as a single segment in the vector, which results in the integration of the entire vector, or two segments homologous to adjacent segments in the chromosome and flanking the expression construct in the vector, which can result in the stable integration of only the expression construct.



WO 02/34771

PCT/GB01/04789

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for the selection of yeast strains that have been transformed. Selectable markers may include biosynthetic genes that can be expressed in the yeast host, such as ADE2, HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene, which confer resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable selectable marker may also provide yeast with the ability to grow in the presence of toxic compounds, such as metal. For example, the presence of CUP1 allows yeast to grow in the presence of copper ions [Butt et al. (1987) Microbiol. Rev. 51:351].

Alternatively, some of the above described components can be put together into transformation vectors. Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector, as described above.

Expression and transformation vectors, either extrachromosomal replicons or integrating vectors, have been developed for transformation into many yeasts. For example, expression vectors have been developed for, inter alia, the following yeasts: Candida albicans [Kurtz et al. (1986) Mol. Cell. Biol. 6:142], Candida maltosa [Kunze et al. (1985) J. Basic Microbiol. 25:141], Hansenula polymorpha [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302], Kluyveromyces fragilis [Das et al. (1984) J. Bacteriol. 158:1165], Kluyveromyces lactis [De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et al. (1990) Bio/Technology 8:135], Pichia guillerimondii [Kunze et al. (1985) J. Basic Microbiol. 25:141], Pichia pastoris [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; US Patent Nos. 4,837,148 and 4,929,555], Saccharomyces cerevisiae [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163], Schizosaccharomyces pombe [Beach and Nurse (1981) Nature 300:706], and Yarrowia lipolytica [Davidow et al. (1985) Curr. Genet. 10:380471; Gaillardin et al. (1985) Curr. Genet. 10:49].

Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually include either the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures usually vary with the yeast species to be transformed. See e.g. [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic Microbiol. 25:141; Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et al. (1983) J. Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135; Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; US Patent Nos. 4,837,148 and 4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach and Nurse (1981) Nature 300:706; Schizosaccharomyces]; [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49; Yarrowia].

Antibodies 

As used herein, the term "antibody" refers to a polypeptide or group of polypeptides composed of at least one antibody combining site. An "antibody combining site" is the three-dimensional binding space with an internal surface shape and charge distribution complementary to the features of an epitope of an antigen, which allows a binding of the antibody with the antigen. "Antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, humanised antibodies, altered antibodies, univalent antibodies, Fab proteins, and single domain antibodies.

Antibodies against the proteins of the invention are useful for affinity chromatography, immunoassays, and distinguishing/identifying streptococcus proteins.

Antibodies to the proteins of the invention, both polyclonal and monoclonal, may be prepared by conventional methods. In general, the protein is first used to immunize a suitable animal, preferably a mouse, rat, rabbit or goat. Rabbits and goats are preferred for the preparation of polyclonal sera due to the volume of serum obtainable, and the availability of labeled anti-rabbit and anti-goat antibodies. Immunization is generally performed by mixing or emulsifying the protein in saline, preferably in an adjuvant such as Freund's complete adjuvant, and injecting the mixture or emulsion parenterally (generally subcutaneously or intramuscularly). A dose of 50-200 µg/injection is typically sufficient. Immunization is generally boosted 2-6 weeks later with one or more injections of the protein in saline, preferably using Freund's incomplete adjuvant. One may alternatively generate antibodies by in vitro immunization using methods known in the art, which for the purposes of this invention is considered equivalent to in vivo immunization. Polyclonal antisera is obtained by bleeding the immunized animal into a glass or plastic container, incubating the blood at 25°C for one hour, followed by incubating at 4°C for 2-18 hours. The serum is recovered by centrifugation (e.g. 1,000 g for 10 minutes). About 20-50 ml per bleed may be obtained from rabbits.
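
The centrifugation figure above is a relative centrifugal force, so the rotor speed needed to reach it depends on the rotor radius. A minimal sketch of the conversion follows, using the standard relation RCF = 1.118 x 10^-5 x r x N^2 (r in cm, N in rpm). The function names and the 10 cm rotor radius are assumptions made for the illustration and do not come from the specification.

```python
"""Convert between relative centrifugal force (x g) and rotor speed (rpm)."""
import math

RCF_FACTOR = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    return RCF_FACTOR * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed in rpm needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_FACTOR * radius_cm))


if __name__ == "__main__":
    radius_cm = 10.0  # assumed rotor radius; check the actual rotor
    rpm = rpm_from_rcf(1000.0, radius_cm)
    print(f"1,000 g at r = {radius_cm} cm needs about {rpm:.0f} rpm")
```

For the assumed 10 cm radius the speed comes out at roughly 3,000 rpm; a smaller rotor needs a proportionally higher speed for the same force.
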
Monoclonal antibodies are prepared using the standard method of Kohler & Milstein [Nature (1975) 256:495-96], or a modification thereof. Typically, a mouse or rat is immunized as described above. However, rather than bleeding the animal to extract serum, the spleen (and optionally several large lymph nodes) is removed and dissociated into single cells. If desired, the spleen cells may be screened (after removal of nonspecifically adherent cells) by applying a cell suspension to a plate or well coated with the protein antigen. B-cells expressing membrane-bound immunoglobulin specific for the antigen bind to the plate, and are not rinsed away with the rest of the suspension. Resulting B-cells, or all dissociated spleen cells, are then induced to fuse with myeloma cells to form hybridomas, and are cultured in a selective medium (e.g. hypoxanthine, aminopterin, thymidine medium, "HAT"). The resulting hybridomas are plated by limiting dilution, and are assayed for production of antibodies which bind specifically to the immunizing antigen (and which do not bind to unrelated antigens). The selected MAb-secreting hybridomas are then cultured either in vitro (e.g. in tissue culture bottles or hollow fiber reactors), or in vivo (as ascites in mice).
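
The reason for plating by limiting dilution can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that cells are distributed among wells according to a Poisson distribution and that every plated cell grows. The function names and the seeding density of 0.3 cells per well are hypothetical choices, not values from the specification.

```python
"""Estimate clonality of wells plated by limiting dilution (Poisson model)."""
import math


def fraction_of_wells_with_growth(mean_cells_per_well):
    """Probability that a well receives at least one cell."""
    if mean_cells_per_well <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    return 1.0 - math.exp(-mean_cells_per_well)


def probability_monoclonal(mean_cells_per_well):
    """Probability that a well showing growth was seeded by exactly one cell."""
    lam = mean_cells_per_well
    exactly_one = lam * math.exp(-lam)
    return exactly_one / fraction_of_wells_with_growth(lam)


if __name__ == "__main__":
    lam = 0.3  # assumed average number of cells seeded per well
    print(f"wells with growth: {fraction_of_wells_with_growth(lam):.1%}")
    print(f"of those, monoclonal: {probability_monoclonal(lam):.1%}")
```

Under these assumptions, lowering the seeding density raises the proportion of growing wells that derive from a single hybridoma, at the cost of more empty wells.
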
If desired, the antibodies (whether polyclonal or monoclonal) may be labeled using conventional techniques. Suitable labels include fluorophores, chromophores, radioactive atoms (particularly 32P and 125I), electron-dense reagents, enzymes, and ligands having specific binding partners. Enzymes are typically detected by their activity. For example, horseradish peroxidase is usually detected by its ability to convert 3,3',5,5'-tetramethylbenzidine (TMB) to a blue pigment, quantifiable with a spectrophotometer. "Specific binding partner" refers to a protein capable of binding a ligand molecule with high specificity, as for example in the case of an antigen and a monoclonal antibody specific therefor. Other specific binding partners include biotin and avidin or streptavidin, IgG and protein A, and the numerous receptor-ligand couples known in the art. It should be understood that the above description is not meant to categorize the various labels into distinct classes, as the same label may serve in several different modes. For example, 125I may serve as a radioactive label or as an electron-dense reagent. HRP may serve as enzyme or as antigen for a MAb. Further, one may combine various labels for desired effect. For example, MAbs and avidin also require labels in the practice of this invention: thus, one might label a MAb with biotin, and detect its presence with avidin labeled with 125I, or with an anti-biotin MAb labeled with HRP. Other permutations and possibilities will be readily apparent to those of ordinary skill in the art, and are considered as equivalents within the scope of the invention.

Pharmaceutical Compositions 

Pharmaceutical compositions can comprise either polypeptides, antibodies, or nucleic acid of the invention. The pharmaceutical compositions will comprise a therapeutically effective amount of either polypeptides, antibodies, or polynucleotides of the claimed invention.

The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a detectable therapeutic or preventative effect. The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include reduction in physical symptoms, such as decreased body temperature. The precise effective amount for a subject will depend upon the subject's size and health, the nature and extent of the condition, and the therapeutics or combination of therapeutics selected for administration. Thus, it is not useful to specify an exact effective amount in advance. However, the effective amount for a given situation can be determined by routine experimentation and is within the judgement of the clinician.

For purposes of the present invention, an effective dose will be from about [...] mg/kg of the molecule of the invention in the individual to which it is administered.

A pharmaceutical composition can also contain a pharmaceutically acceptable carrier. The term "pharmaceutically acceptable carrier" refers to a carrier for administration of a therapeutic agent, such as antibodies or a polypeptide, genes, and other therapeutic agents. The term refers to any pharmaceutical carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition, and which may be administered without undue toxicity. Suitable carriers may be large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in the art.

Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts such as hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates, malonates, benzoates, and the like. A thorough discussion of pharmaceutically acceptable excipients is available in Remington's Pharmaceutical Sciences (Mack Pub. Co., N.J. 1991).

Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids such as water, saline, glycerol and ethanol. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles. Typically, the therapeutic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. Liposomes are included within the definition of a pharmaceutically acceptable carrier.

Delivery Methods

Once formulated, the compositions of the invention can be administered directly to the subject. The subjects to be treated can be animals; in particular, human subjects can be treated.

Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, intraperitoneally, intravenously or intramuscularly, or delivered to the interstitial space of a tissue. The compositions can also be administered into a lesion. Other modes of administration include oral and pulmonary administration, suppositories, and transdermal or transcutaneous applications (e.g. see WO98/20734), needles, and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule.

Vaccines 

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or therapeutic (i.e. to treat disease after infection).

Such vaccines comprise immunising antigen(s), immunogen(s), polypeptide(s), protein(s) or nucleic acid, usually in combination with "pharmaceutically acceptable carriers," which include any carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles. Such carriers are well known to those of ordinary skill in the art. Additionally, these carriers may function as immunostimulating agents ("adjuvants"). Furthermore, the antigen or immunogen may be conjugated to a bacterial toxoid, such as a toxoid from diphtheria, tetanus, cholera, H. pylori, etc. pathogens.
Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: (1) oil-in-water emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides (see below) or bacterial cell wall components), such as for example (a) MF59™ (WO90/14837; Chapter 10 in Vaccine Design - the subunit and adjuvant approach (1995) ed. Powell & Newman), containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing MTP-PE) formulated into submicron particles using a microfluidizer, (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (2) saponin adjuvants, such as QS21 or Stimulon™ (Cambridge Bioscience, Worcester, MA) may be used or particles generated therefrom such as ISCOMs (immunostimulating complexes), which ISCOMs may be devoid of additional detergent e.g. WO00/07621; (3) Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (4) cytokines, such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12 (WO99/44636), etc.), interferons (e.g. gamma interferon), macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc.; (5) monophosphoryl lipid A (MPL) or 3-O-deacylated MPL (3dMPL) e.g. GB-2220221, EP-A-0689454; (6) combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions e.g. EP-A-0835318, EP-A-0735898, EP-A-0761231; (7) oligonucleotides comprising CpG motifs [Krieg Vaccine 2000, 19, 618-622; Krieg Curr Opin Mol Ther 2001 3:15-24; Roman et al., Nat. Med., 1997, 3, 849-854; Weiner et al., PNAS USA, 1997, 94, 10833-10837; Davis et al., J. Immunol., 1998, 160, 870-876; Chu et al., J. Exp. Med., 1997, 186, 1623-1631; Lipford et al., Eur. J. Immunol., 1997, 27, 2340-2344; Moldoveanu et al., Vaccine, 1988, 16, 1216-1224; Krieg et al., Nature, 1995, 374, 546-549; Klinman et al., PNAS USA, 1996, 93, 2879-2883; Ballas et al., J. Immunol., 1996, 157, 1840-1845; Cowdery et al., J. Immunol., 1996, 156, 4570-4575; Halpern et al., Cell. Immunol., 1996, 167, 72-78; Yamamoto et al., Jpn. J. Cancer Res., 1988, 79, 866-873; Stacey et al., J. Immunol., 1996, 157, 2116-2122; Messina et al., J. Immunol., 1991, 147, 1759-1764; Yi et al., J. Immunol., 1996, 157, 4918-4925; Yi et al., J. Immunol., 1996, 157, 5394-5402; Yi et al., J. Immunol., 1998, 160, 4755-4761; and Yi et al., J. Immunol., 1998, 160, 5898-5906; International patent applications WO96/02555, WO98/16247, WO98/18810, WO98/40100, WO98/55495, WO98/37919 and WO98/52581] i.e. containing at least one CG dinucleotide, with 5-methylcytosine optionally being used in place of cytosine; (8) a polyoxyethylene ether or a polyoxyethylene ester e.g. WO99/52549; (9) a polyoxyethylene sorbitan ester surfactant in combination with an octoxynol (e.g. WO01/21207) or a polyoxyethylene alkyl ether or ester surfactant in combination with at least one additional non-ionic surfactant such as an octoxynol (e.g. WO01/21152); (10) an immunostimulatory oligonucleotide (e.g. a CpG oligonucleotide) and a saponin e.g. WO00/62800; (11) an immunostimulant and a particle of metal salt e.g. WO00/23105; (12) a saponin and an oil-in-water emulsion e.g. WO99/11241; (13) a saponin (e.g. QS21) + 3dMPL + IL-12 (optionally + a sterol) e.g. WO98/57659; (14) aluminium salts, preferably hydroxide or phosphate, but any other suitable salt may also be used (e.g. hydroxyphosphate, oxyhydroxide, orthophosphate, sulphate etc. [e.g. see chapters 8 & 9 of Powell & Newman]). Mixtures of different aluminium salts may also be used. The salt may take any suitable form (e.g. gel, crystalline, amorphous etc.); and (15) other substances that act as immunostimulating agents to enhance the efficacy of the composition.
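
The criterion given in item (7) of the list above, namely that the oligonucleotide contains at least one CG dinucleotide, is simple to check for a candidate sequence. The following sketch is illustrative only: the function names and the example oligonucleotide are assumptions, and because it works on plain A/C/G/T text it cannot represent the optional use of 5-methylcytosine in place of cytosine.

```python
"""Locate CG dinucleotides in a candidate oligonucleotide sequence."""


def cpg_positions(oligo):
    """Return the 0-based positions at which a CG dinucleotide starts."""
    oligo = oligo.upper()
    return [i for i in range(len(oligo) - 1) if oligo[i:i + 2] == "CG"]


def has_cpg_motif(oligo):
    """True if the sequence contains at least one CG dinucleotide."""
    return bool(cpg_positions(oligo))


if __name__ == "__main__":
    candidate = "TCCATGACGTTCCTGACGTT"  # hypothetical 20-mer used as an example
    print(cpg_positions(candidate), has_cpg_motif(candidate))
```

The presence of a CG dinucleotide is only the minimal structural criterion stated in the text; whether a given oligonucleotide is actually immunostimulatory would have to be determined experimentally.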

As mentioned above, muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc.

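The immunogenic compositions (e.g. the immunising antigen/immunogen/polypeptide/protein/nucleic acid, pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.

Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed above under pharmaceutically acceptable carriers.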

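Immunogenic compositions used as vaccines comprise an immunologically effective amount of the antigenic or immunogenic polypeptides, as well as any other of the above-mentioned components, as needed. By "immunologically effective amount", it is meant that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated (e.g. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials.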

The immunogenic compositions are conventionally administered parenterally, e.g. by injection, either subcutaneously, intramuscularly, or transdermally/transcutaneously (e.g. WO98/20734). Additional formulations suitable for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal applications. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.

As an alternative to protein-based vaccines, DNA vaccination may be used [e.g. Robinson & Torres (1997) Seminars in Immunol 9:271-283; Donnelly et al. (1997) Annu Rev Immunol 15:617-648; later herein].

Gene Delivery Vehicles

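Gene therapy vehicles for delivery of constructs including a coding sequence of a therapeutic of the invention, to be delivered to the mammal for expression in the mammal, can be administered either locally or systemically. These constructs can utilize viral or non-viral vector approaches in in vivo or ex vivo modality. Expression of such coding sequence can be induced using endogenous mammalian or heterologous promoters. Expression of the coding sequence in vivo can be either constitutive or regulated.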

The invention includes gene delivery vehicles capable of expressing the contemplated nucleic acid sequences. The gene delivery vehicle is preferably a viral vector and, more preferably, a retroviral, adenoviral, adeno-associated viral (AAV), herpes viral, or alphavirus vector. The viral vector can also be an astrovirus, coronavirus, orthomyxovirus, papovavirus, paramyxovirus, parvovirus, picornavirus, poxvirus, or togavirus viral vector. See generally, Jolly (1994) Cancer Gene Therapy 1:51-64; Kimura (1994) Human Gene Therapy 5:845-852; Connelly (1995) Human Gene Therapy 6:185-193; and Kaplitt (1994) Nature Genetics 6:148-153.

Retroviral vectors are well known in the art and we contemplate that any retroviral gene therapy vector is employable in the invention, including B, C and D type retroviruses, xenotropic retroviruses (for example, NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J. Virol. 53:160)), polytropic retroviruses e.g. MCF and MCF-MLV (see Kelly (1983) J. Virol. 45:291), spumaviruses and lentiviruses. See RNA Tumor Viruses, Second Edition, Cold Spring Harbor Laboratory, 1985.

Portions of the retroviral gene therapy vector may be derived from different retroviruses. For example, retrovector LTRs may be derived from a Murine Sarcoma Virus, a tRNA binding site from a Rous Sarcoma Virus, a packaging signal from a Murine Leukemia Virus, and an origin of second strand synthesis from an Avian Leukosis Virus.

These recombinant retroviral vectors may be used to generate transduction competent retroviral vector particles by introducing them into appropriate packaging cell lines (see US patent 5,591,624). Retrovirus vectors can be constructed for site-specific integration into host cell DNA by incorporation of a chimeric integrase enzyme into the retroviral particle (see WO96/37626). It is preferable that the recombinant viral vector is a replication defective recombinant virus.

Packaging cell lines suitable for use with the above-described retrovirus vectors are well known in the art, are readily prepared (see WO95/30763 and WO92/05266), and can be used to create producer cell lines (also termed vector cell lines or "VCLs") for the production of recombinant vector particles. Preferably, the packaging cell lines are made from human parent cells (e.g. HT1080 cells) or mink parent cell lines, which eliminates inactivation in human serum.

Preferred retroviruses for the construction of retroviral gene therapy vectors include Avian Leukosis Virus, Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell Focus-Inducing Virus, Murine Sarcoma Virus, Reticuloendotheliosis Virus and Rous Sarcoma Virus. Particularly preferred Murine Leukemia Viruses include [...] 19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC No. VR-24[...]) [...] Sarcoma Virus and Rauscher (ATCC No. VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190). Such retroviruses may be obtained from depositories or collections such as the American Type Culture Collection ("ATCC") in Rockville, Maryland or isolated from known sources.

Exemplary known retroviral gene therapy vectors employable in this invention include those described in patent applications
GB2200651, EP0415731, EP0345242, EP0334301, WO89/02468; WO89/05349, WO89/09271, WO90/02806,
WO90/07936, WO94/03622, WO93/25698, WO93/25234, WO93/11230, WO93/10218, WO91/02805, WO91/02825,
WO95/07994, US 5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US 4,777,127, US 5,591,624. See also Vile
(1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res 53:962-967; Ram (1993) Cancer Res 53 (1993) 83-88;
Takamiya (1992) J Neurosci Res 33:493-503; Baba (1993) J Neurosurg 79:729-735; Mann (1983) Cell 33:153; Cane (1984)
Proc Natl Acad Sci 81:6349; and Miller (1990) Human Gene Therapy 1.

Human adenoviral gene therapy vectors are also known in the art and employable in this invention. See, for example, Berkner
(1988) Biotechniques 6:616 and Rosenfeld (1991) Science 252:431, and WO93/07283, WO93/06223, and WO93/07282.
Exemplary known adenoviral gene therapy vectors employable in this invention include those described in the above referenced
documents and in WO94/12649, WO93/03769, WO93/19191, WO94/28938, WO95/11984, WO95/00655, WO95/27071,
WO95/29993, WO95/34671, WO96/05320, WO94/08026, WO94/11506, WO93/06223, WO94/24299, WO95/14102,
WO95/24297, WO95/02697, WO94/28152, WO94/24299, WO95/09241, WO95/25807, WO95/05835, WO94/18922 and
WO95/09654. Alternatively, administration of DNA linked to killed adenovirus as described in Curiel (1992) Hum. Gene Ther.
3:147-154 may be employed. The gene delivery vehicles of the invention also include adenovirus associated virus (AAV) vectors.
Leading and preferred examples of such vectors for use in this invention are the AAV-2 based vectors disclosed in Srivastava,
WO93/09239. Most preferred AAV vectors comprise the two AAV inverted terminal repeats in which the native D-sequences
are modified by substitution of nucleotides, such that at least 5 native nucleotides and up to 18 native nucleotides, preferably at
least 10 native nucleotides up to 18 native nucleotides, most preferably 10 native nucleotides are retained and the remaining
nucleotides of the D-sequence are deleted or replaced with non-native nucleotides. The native D-sequences of the AAV inverted
terminal repeats are sequences of 20 consecutive nucleotides in each AAV inverted terminal repeat (i.e. there is one sequence at
each end) which are not involved in HP formation. The non-native replacement nucleotide may be any nucleotide other than the
nucleotide found in the native D-sequence in the same position. Other employable exemplary AAV vectors are pWP-19,
pWN-1, both of which are disclosed in Nahreini (1993) Gene 124:257-262. Another example of such an AAV vector is
psub201 (see Samulski (1987) J. Virol. 61:3096). Another exemplary AAV vector is the Double-D ITR vector. Construction of
the Double-D ITR vector is disclosed in US Patent 5,478,745. Still other vectors are those disclosed in Carter US Patent
4,797,368 and Muzyczka US Patent 5,139,941, Chartejee US Patent 5,474,935, and Kotin WO94/288157. Yet a further
example of an AAV vector employable in this invention is SSV9AFABTKneo, which contains the AFP enhancer and albumin
promoter and directs expression predominantly in the liver. Its structure and construction are disclosed in Su (1996) Human
Gene Therapy 7:463-470. Additional AAV gene therapy vectors are described in US 5,354,678, US 5,173,414, US
5,139,941, and US 5,252,479.
The gene therapy vectors of the invention also include herp[...]
vectors containing a sequence encoding a thymidine kinase polypeptide such as those disclosed in US 5,288,641 and
EP0176170 (Roizman). Additional exemplary herpes simplex virus vectors include HFEM/ICP6-LacZ disclosed in
WO95/04139 (Wistar Institute), pHSVlac described in Geller (1988) Science 241:1667-1669 and in WO90/09441 and
WO92/07945, HSV Us3::pgC-lacZ described in Fink (1992) Human Gene Therapy 3:11-19 and HSV 7134, 2 RH 105 and
GAL4 described in EP 0453242 (Breakefield), and those deposited with the ATCC with accession numbers VR-977 and
VR-260.

Also contemplated are [...]
Sindbis viruses vectors. Togaviruses, Semliki Forest virus (ATCC VR-67; ATCC VR-1247), Middleberg virus (ATCC
VR-370), Ross River virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC VR923; ATCC
VR-1250; ATCC VR-1249; ATCC VR-532), and those described in US patents 5,091,309, 5,217,879, and WO92/10578.
More particularly, those alpha virus vectors described in US Serial No. 08/405,627, filed March 15, 1995, WO94/21792,
WO92/10578, WO95/07994, US 5,091,309 and US 5,217,879 are employable. Such alpha viruses may be obtained from
depositories or collections such as ATCC in Rockville, Maryland or [...]
techniques. Preferably, alphavirus vectors with reduced cytotoxicity are used (see USSN 08/679640).

DNA vector systems such as eukaryotic layered expression systems are also useful for expressing the nucleic acids of the
invention. See WO95/07994 for a detailed description of eukaryotic layered expression systems. Preferably, the eukaryotic
layered expression systems of the invention [...]

Other viral vectors suitable for use in the present invention include those derived from [...]
those described in Evans, Nature 339 (1989) 385 and Sabin (1973) J. Biol. Standardization 1:115; rhinovirus, for example
ATCC VR-1110 and those described in Arnold (1990) J Cell Biochem [...] as canary pox virus or vaccinia
virus, for example ATCC VR-111 and ATCC VR-2010 and those described in Fisher-Hoch (1989) Proc Natl Acad Sci
86:317; Flexner (1989) Ann NY Acad Sci 569:86, Flexner (1990) Vaccine 8:17; in US 4,603,112 and US 4,769,330 and
WO89/01973; SV40 virus, for example ATCC VR-305 and those described in Mulligan (1979) Nature 277:108 and Madzak
(1992) J Gen Virol 73:1533; influenza virus, for example ATCC VR-797 and recombinant influenza viruses made employing
reverse genetics techniques as described in US 5,166,057 and in Enami (1990) Proc Natl Acad Sci 87:3802-3805; Enami &
Palese (1991) J Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see also McMichael (1983) NEJ Med 309:13, and Yap
(1978) Nature 273:238 and Nature (1979) 277:108); human immunodeficiency virus as described in EP-0386882 and in
Buchschacher (1992) J. Virol. 66:2731; measles virus, for example ATCC VR-67 and VR-1247 and those described in
EP-0440219; Aura virus, for example ATCC VR-368; Bebaru virus, for example ATCC VR-600 and ATCC VR-1240; Cabassou
virus, for example ATCC VR-922; Chikungunya virus, for example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus,
for example ATCC VR-924; Getah virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus, for example
ATCC VR-927; Mayaro virus, for example ATCC VR-66; Mucambo virus, for example ATCC VR-58[...]
Ndumu virus, for example ATCC VR-371; Pixuna virus, for example ATCC VR-372 and ATCC VR-1245; Tonate virus, for
example ATCC VR-925; Triniti virus, for example ATCC VR-469; Una virus, for example ATCC VR-37[...]
example ATCC VR-926; Y-62-33 virus, for example ATCC VR-375; O'Nyong virus, Eastern encephalitis virus, for example
ATCC VR-65 and ATCC VR-1242; Western encephalitis virus, for example ATCC VR-70, ATCC VR-1251, ATCC VR-622
and ATCC VR-1252; and coronavirus, for example ATCC VR-740 and those described in Hamre (1966) Proc Soc Exp Biol
Med 121:190.

Delivery of the compositions of this invention into cells is not [...]
and media may be employed such as, for example, nucleic acid expression vectors, polycationic condensed DNA linked or
unlinked to killed adenovirus alone, for example see US Serial No. 08/366,787, filed December 30, 1994 and Curiel (1992)
Hum Gene Ther 3:147-154, ligand linked DNA, for example see Wu (1989) J Biol Chem 264:16985-16987, eucaryotic cell
delivery vehicles cells, for example see US Serial No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796, deposition
of photopolymerized hydrogel materials, hand-held gene transfer particle gun, as described in US Patent 5,149,655, ionizing
radiation as described in US5,206,152 and in WO92/11033, nucleic charge neutralization or fusion with cell membranes.
Additional approaches are described in Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc Natl Acad
Sci 91:1581-1585.

Particle mediated gene transfer may be employed, for example see US Serial No. 60/023,867. Briefly, the sequence can be
inserted into conventional vectors that contain conventional control sequences for high level expression, and then incubated with
synthetic gene transfer molecules such as polymeric DNA-binding [...]
targeting ligands such as asialoorosomucoid, as described in [...] 262:4429-4432, insulin as
described in Hucked (1990) Biochem Pharmacol 40:253-263, galactose as described in Plank (1992) Bioconjugate Chem
3:533-539, lactose or transferrin.

Naked DNA may also be employed. Exemplary naked DNA introduction methods are described in WO 90/11092 and US
5,580,859. Uptake efficiency may be improved using biodegradable latex beads. DNA coated latex beads are efficiently
transported into cells after endocytosis initiation by the beads. The method may [...]
increase hydrophobicity and thereby facilitate disruption of the endosome and release of the DNA into the cytoplasm.

Liposomes that can act as gene delivery vehicles are described in US 5,422,120, WO95/13796, WO94/23697, WO91/14445
and EP-524968. As described in USSN 60/023,867, on non-viral delivery, the nucleic acid sequences encoding a polypeptide
can be inserted into conventional vectors that contain conventional control sequences for high level expression, and then be
incubated with synthetic gene transfer molecules such as [...]
linked to cell targeting ligands such as asialoorosomucoid, [...]
the use of liposomes to encapsulate DNA comprising the gene under the control of a variety of tissue-specific or
ubiquitously-active promoters. Further non-viral delivery suitable for use includes mechanical delivery systems such as the
approach described in Woffendin et al (1994) Proc. Natl. Acad. Sci. USA 91(24):11581-11585. Moreover, the coding
sequence and the product of expression of such can be delivered through deposition of [...]
Other conventional methods for gene delivery that can be used for delivery of the [...]
hand-held gene transfer particle gun, as described in US 5,149,655; use of ionizing radiation for activating transferred gene, as
described in US 5,206,152 and WO92/11033.

Exemplary liposome and polycationic gene delivery vehicles are those described in US 5,422,120 and 4,762,915; in WO
95/13796; WO94/23697; and WO91/14445; in EP-0524968; and in Stryer, Biochemistry, pages 236-240 (1975) W.H.
Freeman, San Francisco; Szoka (1980) Biochem Biophys Acta 600:1; Bayer (1979) Biochem Biophys Acta 550:464; Rivnay
(1987) Meth Enzymol 149:119; Wang (1987) Proc Natl Acad Sci 84:7851; Plant (1989) Anal Biochem 176:420.

A polynucleotide composition can comprise a therapeutically eff[...]
above. For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg or 0.05 mg/kg to
about 10 mg/kg of the DNA constructs in the individual to wh[...]

Delivery Methods

Once formulated, the polynucleotide compositions of [...]
vivo, to cells derived from the subject; or (3) in vitro for expression [...]
mammals or birds. Also, human subjects can be treated.

Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, intraperitoneally,
intravenously or intramuscularly or delivered to the interstitial sp[...]
lesion. Other modes of administration include oral and pulmonary administration, suppositories, and transdermal or transcutaneous
applications (e.g. see WO98/20734), needles, and gene guns or hyposprays. Dosage treatment may be a single dose schedule or [...]



Methods for the ex vivo delivery and reimplantation of transformed cells into a sub[...]
WO93/14778. Examples of cells useful in ex vivo applications include, for example, stem cells, particularly hematopoetic, lymph
cells, macrophages, dendritic cells, or tumor cells.

Generally, delivery of nucleic acids for both ex vivo and in vitro applic[...]
for example, dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion,
electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei, all well
known in the art.

Polynucleotide and polypeptide pharmaceutical compositions

In addition to the pharmaceutically acceptable carriers and salts described [...]
polynucleotide and/or polypeptide compositions.

A. Polypeptides

One example are polypeptides which include, without limitation: asioloorosomucoid (ASOR); transferrin; asialoglycoproteins;
antibodies; antibody fragments; ferritin; interleukins; interferons, granulocyte, macrophage colony stimulating factor (GM-CSF),
granulocyte colony stimulating factor (G-CSF), macrophage colony stimulating factor (M-CSF), stem cell factor and
erythropoietin. Viral antigens, such as envelope proteins, can also be used [...]
the 17 amino acid peptide from the circumspo[...]

B. Hormones, Vitamins, etc.

Other groups that can be included are, for example: hormones, steroids, [...]
acid.



WO 02/34771    PCT/GB01/04789
C. Polyalkylenes, Polysaccharides, etc.

Also polyalkylene glycol can be included with the desired polynucleotides/polypeptides. In a preferred embodiment, the
polyalkylene glycol is polyethylene glycol. In addition, mono-, di-, or polysaccharides can be included. In a preferred
embodiment of this aspect, the polysaccharide is dex[...]

D. Lipids and Liposomes

The desired polynucleotide/polypeptide can also be encapsulated [...]



Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or entrap and retain nucleic acid. The
ratio of condensed polynucleotide to lipid preparation can vary but will generally be around 1:1 (mg DNA:micromoles lipid), or [...]
of lipid. For a review of the use of liposomes as carriers for delivery [...]
Biophys. Acta. 1097:1-17; Straubinger (1983) Meth. Enzymol. 101:512-527.

Liposomal preparations for use in the present invention include cationic (positively charged), anionic (negatively charged) and
neutral preparations. Cationic liposomes have been shown to mediate intr[...]
Proc. Natl. Acad. Sci. USA 84:7413-7416); mRNA (Malone (1989) Proc. Natl. Acad. Sci. USA 86:6077-6081); and purified
transcription factors (Debs (1990) J. Biol. Chem. 265:10189-10192), in functional form.

Cationic liposomes are readily available. For example, N[1-2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA)
liposomes are available under the trademark Lipofectin, from GIBCO BRL, [...]
commercially available liposomes include transfectace (DDAB/DOPE) and DOTAP/DOPE (Boerhinger). Other cationic
liposomes can be prepared from readily available materials using techniques well known in the art. See, e.g. Szoka (1978) Proc.
Natl. Acad. Sci. USA 75:4194-4198; WO90/11092 for a description of the synthesis of DOTAP
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily available, such as from Avanti [...]
prepared using readily available materials. Such materials include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine,
dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE), among
others. These materials can also be mixed with the [...]
making liposomes using these materials are well known in the art.

The liposomes can comprise multilammelar vesicles (MLVs), small unilamellar vesicles (SUVs), or large unilamellar vesicles
(LUVs). The various liposome-nucleic acid complexes are prepared using methods known in the art. See e.g. Straubinger (1983)
Meth. Immunol. 101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos (1975) Biochim.
Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer & Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro
(1977) Biochem. Biophys. Res. Commun. 76:836; Fraley (1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch & Strittmatter
(1979) Proc. Natl. Acad. Sci. USA 76:145; Fraley (1980) J. Biol. Chem. (1980) 255:10431; Szoka & Papahadjopoulos
(1978) Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder (1982) Science 215:166.
E. Lipoproteins

In addition, lipoproteins can be included with the polynucleotide/polypeptide [...]
utilized include: chylomicrons, HDL, IDL, LDL, and VLDL. Mutants [...]
Also, modifications of naturally occurring lipoproteins can be used [...]
delivery of polynucleotides to cells expressing lipoprotein receptors. Preferably, if lipoproteins are including with the
polynucleotide to be delivered, no other targeting ligand is included in the composition.

Naturally occurring lipoproteins comprise a lipid and a protein portion. The [...]
present, apoproteins A, B, C, D, and E have been isolated and identified [...]
designated by Roman numerals, AI, AII, AIV; CI, CII, CIII.

A lipoprotein can comprise more than one apoprotein. For example, naturally occurring chylomicrons comprises of A, B, C & E,
over time these lipoproteins lose A and acquire C & E. VLDL [...]
B; and HDL comprises apoproteins A, C, & E.

The amino acid of these apoproteins are known and are described in, for example, Breslow (1985) Annu Rev. Biochem 54:699;
Law (1986) Adv. Exp Med. Biol. 151:162; Chen (1986) J Biol Chem 261:12918; Kane (1980) Proc Natl Acad Sci USA
77:2465; and Utermann (1984) Hum Genet 65:231.

[...] of the lipids varies in naturally occurring lipoproteins. For example, chylomicrons compr[...]
composition of the lipids are chosen to aid in conf[...]
lipids may also be chosen to facilitate hydrop[...]

Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for instance. Such methods are described in
[...]; Pitas (1980) J. Biochem. 255:5454-5460 and Mahey (1979) J Clin. Invest [...]
Lipoproteins can also be produced by in vitro or recombinant methods by expression of the apoprotein genes in a desired host
cell. See, for example, Atkinson (1986) Annu Rev Biophys Chem 15:403 and Radding (1958) Biochim Biophys Acta 30: 443.

Lipoproteins can also be purchased from commercial suppli[...]
Further description of lipoproteins can be found in WO98/06437.

F. Polycationic Agents

Polycationic agents can be included, [...]
be delivered.

Polycationic agents, typically, exhibit a net positive charge at physiological relevant pH and are capable of neutralizing the
electrical charge of nucleic acids to facilitate delivery to a desired [...]
applications. Polycationic agents can be [...]

The following are examples of useful polypeptides as polycationic agents: polylysine, polyarginine, polyornithine, and protamine.
Other examples include histones, protamines, human serum alb[...]
coat proteins from DNA viruses, such as (X174, transcriptional factors also contain domains that bind DNA and therefore may
be useful as nucleic acid condensing agents. Briefly, transcriptional factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3,
CPF, Prot-1, Sp-1, Oct-1, Oct-2, CREP, and TFIID contain basic domains that bind DNA sequences.

Organic polycationic agents include: spermine, spermidine, and putrescine.

The dimensions and of the physical properties of a polyca[...]
polypeptide polycationic agents or to produce [...]

Synthetic polycationic agents which are useful include, for example, DEAE-dextran, polybrene. Lipofectin™, and [...]


Immunodiagnostic Assays

Streptococcus antigens of the invention can be u[...] conversely, anti-streptococcus
antibodies can be used to detect antigen levels). Immunoassays based on well defined, recombinant antigens can be developed to
replace invasive diagnostics methods. Antibodies to streptococcus proteins within biological samples, including for example, blood
or serum samples, can be detected. Design of the immunoassays is subject to a great deal of variation, and a variety of these are
known in the art. Protocols for the immunoassay may be based, for example, upon competition, or direct reaction, or sandwich
type assays. Protocols may also, for example, use solid supports, or may be by immunop[...]
of labeled antibody or polypeptide; the labels may be, for example, fluorescent, [...]
Assays which amplify the signals from the probe are also known; examples of which are assays [...]
enzyme-labeled and mediated immunoassays, such as ELISA assays.

Kits suitable for immunodiagnosis and containing the appropriate [...]
materials, including the compositions of the invention, in suitable [...]
example, suitable buffers, salt solutions, etc.) required for the conduct of the assay, as well as suitable set of assay instructions.

Nucleic Acid Hybridisation

"Hybridization" refers to the association of two nucleic acid sequences to one another by hydrogen bonding. Typically, one
sequence will be fixed to a solid support and the other will be free in solution. Then, the two sequ[...]
with one another under conditions that favor hydrogen bonding. Factors that affect this bonding [...]
solvent; reaction temperature; time of hybridization; agitation; agents to [...]
sequence to the solid support (Denhardt's reagent or BLOTTO); concentration of the sequences; use of compounds to increase
the rate of association of sequences (dextran sulfate or polyethylene glycol) [...]
hybridization. See Sambrook et al. [supra] Volume 2, chapter 9, pages 9.47 to 9.57.

"Stringency" refers to condi[...]
[...] For example, the combination of temp[...]
below the calculated Tm of the hybrid under study. The temperature and salt conditions can often be determined empirically in
preliminary experiments in which samples of [...]
then washed under conditions of different stringencies.
Variables to consider when performing, for example, a Southern blot are (1) the complexity of the DNA being blotted and (2) the
homology between the probe and the sequences being detected. The total amount of the fragment(s) to be studied can vary a
magnitude of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10^-9 to 10^-8 g for a single copy gene in a highly complex
eukaryotic genome. For lower complexity polynucleotides, substantially shorter blotting, hybridization, and exposure times, a
smaller amount of starting polynucleotides, and lower specific activity of probes can be used. For example, a single-copy yeast
gene can be detected with an exposure time of only 1 hour starting with 1 µg of yeast DNA, blotting for two hours, and
hybridizing for 4-8 hours with a probe of 10^8 cpm/µg. For a single-copy mamm[...]
with 10 µg of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate using a probe of greater than
10^8 cpm/µg, resulting in an exposure time of ~24 hours.

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe and the fragment of interest,
and consequently, the appropriate conditions for hyb[...]
the fragment. Other commonly encountered variables include the length and [...]
the ionic strength and formamide content of the hybridization [...]
simple equation:

Tm = 81 + 16.6(log10 Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch)

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs (slightly modified from
Meinkoth & Wahl (1984) Anal. Biochem. 138: 267-284).
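
As an illustration only, the equation above can be evaluated numerically. The following Python sketch is not part of the original disclosure: the function and parameter names are hypothetical, the coefficients are taken as read from the equation in the text, and the salt concentration Ci is assumed to be expressed in mol/l.

```python
import math


def melting_temperature(salt_molar, gc_percent, formamide_percent,
                        hybrid_length_bp, mismatch_percent=0.0):
    """Estimate Tm (degrees C) of a DNA-DNA hybrid using the equation in the text.

    salt_molar        -- monovalent salt concentration Ci (assumed mol/l)
    gc_percent        -- %(G + C) of the hybrid
    formamide_percent -- % formamide in the hybridization solution
    hybrid_length_bp  -- n, length of the hybrid in base pairs
    mismatch_percent  -- % mismatch between probe and target
    """
    if salt_molar <= 0:
        raise ValueError("salt concentration must be positive")
    if hybrid_length_bp <= 0:
        raise ValueError("hybrid length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / hybrid_length_bp
            - 1.5 * mismatch_percent)


if __name__ == "__main__":
    # Hypothetical example: 1 M salt, 50% GC, 50% formamide, 600 bp hybrid.
    print(melting_temperature(1.0, 50.0, 50.0, 600))
```

Under this equation each additional percent of mismatch lowers the estimated Tm by 1.5 degrees, which is one way to see why less homologous probes call for lower hybridization temperatures.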

In designing a hybridization experiment, some factors affecting nucleic acid hybridization can be conveniently altered. The
temperature of the hybridization and washes and the salt concentration during the washes are the simplest to adjust. As the
temperature of the hybridization increases (i.e. stringency), it becomes [...]
are nonhomologous, and as a result, background decreases. If the radiolabeled probe is not completely homologous with the
immobilized fragment (as is frequently the case in gene family and interspecies hybridization experiments), the hybridization
temperature must be reduced, and background will increase. The temp[...]
band and the degree of background in a similar manner. The stringency of the washes [...]
concentrations.

In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a probe which is 95% to 100%
homologous to the target fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% homology. For lower
homologies, formamide content should be lowered and temperature adjusted accordingly, using the equation above. If the
homology between the probe and the target fragment are not known, the [...]
wash conditions which are nonstringent [...]
can be washed at high stringency and reexposed. If the time required for exposure [...]
hybridization and/or washing stringencies should [...]
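
The temperature guidance given above for 50% formamide can be written as a small lookup. The Python sketch below is an illustration under stated assumptions, not part of the original disclosure: the function name is hypothetical, and because the quoted homology ranges share their end points (95% and 90%), the sketch arbitrarily assigns a boundary value to the higher temperature.

```python
def convenient_hybridization_temp_c(homology_percent):
    """Return the convenient hybridization temperature (degrees C) in 50% formamide.

    Follows the ranges quoted in the text: 42 for 95-100% homology,
    37 for 90-95% and 32 for 85-90%. Returns None below 85%, where the
    text instead advises lowering formamide and recalculating from the
    Tm equation.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:  # boundary values go to the higher bracket (assumption)
        return 37
    if homology_percent >= 85:
        return 32
    return None


if __name__ == "__main__":
    for homology in (98, 92, 86, 70):
        print(homology, convenient_hybridization_temp_c(homology))
```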
Nucleic Acid Probe Assays

Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing nucleic acid probes according to the
invention can determine the presence of cDNA or mRNA [...]
form a duplex or double stranded complex, which is stable enough to be detected.

The nucleic acid probes will hybridize to the streptococcus nucleotide sequences of the invention (including both sense and
antisense strands). Though many different nucleotide sequences [...]
sequence is preferred because it is the actual sequence present in cells. mRNA represents a [...]
should be complementary to the coding sequence; single-stranded cDNA is co[...]
should be complementary to the non-coding sequence.
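
The relationship between target and probe strands described above can be illustrated with a short reverse-complement sketch in Python. This is an assumption-laden illustration, not part of the original disclosure: the function names are hypothetical, sequences are written 5' to 3' as DNA, and only the four unambiguous bases are handled.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def probe_for_mrna(coding_strand):
    """Probe for an mRNA target: complementary to the coding sequence."""
    return reverse_complement(coding_strand)


def probe_for_single_stranded_cdna(coding_strand):
    """Probe for single-stranded cDNA: complementary to the non-coding strand,
    which means it has the same sequence as the coding strand."""
    non_coding_strand = reverse_complement(coding_strand)
    return reverse_complement(non_coding_strand)


if __name__ == "__main__":
    coding = "ATGGCTAAAGAT"  # hypothetical coding-strand fragment
    print(probe_for_mrna(coding))
    print(probe_for_single_stranded_cdna(coding))
```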

The probe sequence need not be identical to the streptococcus sequence (or its c[...]
and length can lead to increased assay sensitivity if the nucleic acid probe can form a duplex with target nucleotides, which can be
detected. Also, the nucleic acid probe can include additional nucleotides to stab[...]
sequence may also be helpful as a label to detect the formed duplex. For example, a non-complem[...]
may be attached to the 5' end of the probe, with the rem[...]
sequence. Alternatively, non-complementary bases or longer sequences can be interspersed into the probe, provided that the
probe sequence has sufficient complementarity [...]
form a duplex which can be detected.

The exact length and sequence of the probe will depend on the hybridization conditions (e.g. temperature, salt condition etc.). For
example, for diagnostic applications, depending on the complexity of the analyte sequence, the nucleic acid probe typically
contains at least 10-20 nucleotides, preferably 15-25, and more preferably at least 30 nucleotides, although it may be shorter than
this. Short primers generally require cooler temp[...]

Probes may be produced by synthetic procedures, such as the triester method of Matteucci et al. [J. Am. Chem. Soc. (1981)
103:3185], or according to Urdea et al. [Proc. Natl. Acad. Sci. USA (1983) 80: 7461], or using commercially available
automated oligonucleotide synthesizers.

The chemical nature of the probe can be selected according to preference. For certain applications, DNA or RNA are
appropriate. For other applications, modifications may be incorporated e.g. backbone modifications, such as phosphorothioates
or methylphosphonates, can be used to increase in vivo half-life, alter RNA affi[...] see
Agrawal & Iyer (1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996) TIBTECH 14:376-387]; analogues such as peptide
nucleic acids may also be used [e.g. see Corey (1997) TIBTECH 15:224-229; Buchardt et al. (1993) TIBTECH 11:384-386].

Alternatively, the polymerase chain reaction (PCR) is another well-known [...]
acid. The assay is described in Mullis et al. [Meth. Enzymol. (1987) 155:335-350] & US patents 4,683,195 & 4,683,202. Two
"primer" nucleotides hybridize with the target nucleic acids and are used to prime the reaction. The primers can comprise
sequence that does not hybridize to the sequence of the amplification target (or its compl[...]
example, to incorporate a convenient restriction site. Typically, such sequ[...]

A thermostable polymerase creates copies of target nucleic acids from the primers using the original target nucleic acids as a
template. After a threshold amount of target nucleic acids are generated by the polymerase, they can be detected by more
traditional methods, such as Southern blots. When using the Southern blot method, the labelled probe will hybridize to the
streptococcus sequence (or its complement).

Also, mRNA or cDNA can be detected by traditional blotting techniques described in Sambrook et al [supra]. mRNA, or
cDNA generated from mRNA using a polymerase enzyme, can be purified and separated using gel electrophoresis. The nucleic
acids on the gel are then blotted onto a solid support, such as nitrocellulose. The solid sup[...]
then washed to remove any unhybridized p[...]
probe is labelled with a radioactive moi[...]

BRIEF DESCRIPTION OF DRAWINGS 

Figures 1 to 85, 119 to 188, 238 and 239 show SDS-PAGE analysis of total cell extracts from
cultures of recombinant E.coli expressing GBS proteins of the invention. Lane 1 in each gel (except for
Figure 185) contains molecular weight markers. These are 94, 67, 43, 30, 20.1 & 14.4 kDa (except for
Figures 7, 8, 10, 11, 13, 14, 15 and 119-170, which use 250, 150, 100, 75, 50, 37, 25, 15 & 10 kDa).

Figure 86A shows the pDEST15 vector and Figure 86B shows the pDEST17-1 vector.

Figures 88 to 118 and 247 to 319 show protein characterisation data for various proteins of the
invention.

Figures 189 to 237 and 240 to 246 show SDS-PAGE analysis of purified GBS proteins of the
invention. The left-hand lane contains molecular weight markers. These are 94, 67, 43, 30, 20.1 & 14.4
kDa.

MODES FOR CARRYING OUT THE INVENTION

The following examples describe nucleic acid sequences which have been identified in Streptococcus,
along with their inferred translation products. The examples are generally in the following format:

• a nucleotide sequence which has been identified in Streptococcus 
• the inferred translation product of this sequence

• a computer analysis (e.g. PSORT output) of the translation product, indicating antigenicity 

Most examples describe nucleotide sequences from S.agalactiae. The specific strain which was
sequenced was from serotype V, and is a clinical strain isolated in Italy which expresses the R antigen
(ISS/Rome/Italy collection, strain 2603 V/R). For several of these examples, the corresponding
sequences from S.pyogenes are also given. Where GBS and GAS show homology in this way, there is
conservation between species which suggests an essential function and also gives good cross-species
reactivity.

In contrast, several examples describe nucleotide sequences from GAS for which no homolog in GBS
has been identified. This lack of homology gives molecules which are useful for distinguishing GAS
from GBS and for making GAS-specific products. The same is true for GBS sequences which lack
GAS homologs e.g. these are useful for making GBS-specific products.

The examples typically include details of homology to sequences in the public databases. Proteins that
are similar in sequence are generally similar in both structure and function, and the homology often
indicates a common evolutionary origin. Comparison with sequences of proteins of known function is
widely used as a guide for the assignment of putative protein function to a new sequence and has proved
particularly useful in whole-genome analyses.

Various tests can be used to assess the in vivo immunogenicity of the proteins identified in the examples.
For example, the proteins can be expressed recombinantly and used to screen patient sera by
immunoblot. A positive reaction between the protein and patient serum indicates that the patient has
previously mounted an immune response to the protein in question i.e. the protein is an immunogen. This
method can also be used to identify immunodominant proteins. The mouse model used in the examples
can also be used.

The recombinant protein can also be conveniently used to prepare antibodies e.g. in a mouse. These can
be used for direct confirmation that a protein is located on the cell-surface. Labelled antibody (e.g.
fluorescent labelling for FACS) can be incubated with intact bacteria and the presence of label on the
bacterial surface confirms the location of the protein.

For many GBS proteins, the following data are given:
- SDS-PAGE analysis of total recombinant E.coli cell extracts for GBS protein expression
- SDS-PAGE analysis after the protein purification
- Western-blot analysis of GBS total cell extract using antisera raised against recombinant proteins
- FACS and ELISA analysis against GBS using antisera raised against recombinant proteins
- Results of the in vivo passive protection assay

Details of experimental techniques used are presented below:

Sequence analysis

Open reading frames (ORFs) within nucleotide sequences were predicted using the GLIMMER program
[Salzberg et al. (1998) Nucleic Acids Res 26:544-8]. Where necessary, start codons were modified and
corrected manually on the basis of the presence of ribosome-binding sites and promoter regions on the
upstream DNA sequence.

ORFs were then screened against the non-redundant protein databases using the programs BLASTp
[Altschul et al. (1990) J. Mol. Biol. 215:403-410] and PRAZE, a modification of the Smith-Waterman
algorithm [Smith & Waterman (1981) J Mol Biol 147:195-7; see Fleischmann et al (1995) Science
269:496-512].
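For orientation, the unmodified Smith-Waterman recurrence can be sketched in a few lines. This is not PRAZE, whose modifications are not described here. The scoring values below (match +2, mismatch -1, linear gap -1) are illustrative assumptions; a real protein search would use a substitution matrix and affine gap penalties.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score between sequences a and b.

    Plain Smith-Waterman with a linear gap penalty; the scoring values are
    toy assumptions, not the parameters used for the database searches.
    """
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Only two rows of the matrix are kept because the sketch reports the score alone; recovering the alignment itself would need the full matrix and a traceback.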

Leader peptides within the ORFs were located using three different approaches: (i) PSORT [Nakai
(1991) Bull. Inst. Chem. Res., Kyoto Univ. 69:269-291; Horton & Nakai (1996) Intellig. Syst. Mol. Biol.
4:109-115; Horton & Nakai (1997) Intellig. Syst. Mol. Biol. 5:147-152]; (ii) SignalP [Nielsen & Krogh
(1998) in Proceedings of the Sixth International Conference on Intelligent Systems for Molecular
Biology (ISMB 6), AAAI Press, Menlo Park, California, pp. 122-130; Nielsen et al (1999) Protein
Engineering 12:3-9; Nielsen et al. (1997) Int. J. Neural Sys. 8:581-599]; and (iii) visual inspection of the
ORF sequences. Where a signal sequence is given a "possible site" value, the value represents the
C-terminus residue of the signal peptide e.g. a "possible site" of 26 means that the signal sequence
consists of amino acids 1-26.
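A minimal sketch of this convention is given below. It treats the "possible site" value as a 1-based position of the last signal-peptide residue. The function name and the example sequence are hypothetical and are not taken from the examples.

```python
def split_signal_peptide(protein: str, possible_site: int) -> tuple[str, str]:
    """Split a protein into (signal peptide, mature protein).

    possible_site is the 1-based position of the C-terminal residue of the
    signal peptide, so a value of 26 yields residues 1-26 as the signal.
    """
    if not 0 < possible_site < len(protein):
        raise ValueError("possible_site must lie inside the protein sequence")
    return protein[:possible_site], protein[possible_site:]


# Hypothetical 30-residue sequence, used only to show the indexing.
signal, mature = split_signal_peptide("M" + "K" * 25 + "ASDF", 26)
```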

Lipoprotein-specific signal peptides were located using three different approaches: (i) PSORT [see
above]; (ii) the "prokaryotic membrane lipoprotein lipid attachment site" PROSITE motif [Hofmann et
al. (1999) Nucleic Acids Res. 27:215-219; Bucher & Bairoch (1994) in Proceedings 2nd International
Conference on Intelligent Systems for Molecular Biology (ISMB-94), AAAI Press, pages 53-61]; and
(iii) the FINDPATTERNS program available in the GCG Wisconsin Package, using the pattern
(M,L,V)x{9,35}LxxCx.

Transmembrane domains were located using two approaches: (i) PSORT [see above]; (ii) TopPred [von
Heijne (1992) J. Mol. Biol. 225:487-494].

LPXTG motifs, characteristic of cell-wall attached proteins in Gram-positive bacteria [Fischetti et al.
(1990) Mol Microbiol 4:1603-5] were located with FINDPATTERNS using the pattern
(L,I,V,M,Y,F)Px(T,A,S,G)(G,N,S,T,A,L).

RGD motifs, characteristic of cell-adhesion molecules [D'Souza et al. (1991) Trends Biochem Sci
16:246-50] were located using FINDPATTERNS.
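The motif searches can be approximated with ordinary regular expressions. The sketch below is not FINDPATTERNS itself. It assumes that a parenthesised list means "any one of these residues", that x means any residue, and that x{9,35} means 9 to 35 residues. The lipoprotein pattern is read as (M,L,V)x{9,35}LxxCx and the LPXTG pattern as (L,I,V,M,Y,F)Px(T,A,S,G)(G,N,S,T,A,L). The function name and the test sequence are hypothetical.

```python
import re

# Assumed regular-expression equivalents of the FINDPATTERNS patterns.
MOTIFS = {
    "lipoprotein": re.compile(r"[MLV].{9,35}L..C."),
    "LPXTG": re.compile(r"[LIVMYF]P.[TASG][GNSTAL]"),
    "RGD": re.compile(r"RGD"),
}


def find_motifs(protein: str) -> dict:
    """Return {motif name: [(start, end), ...]} with 1-based inclusive positions."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # finditer reports non-overlapping matches, scanning left to right.
        hits[name] = [(m.start() + 1, m.end()) for m in pattern.finditer(protein)]
    return hits


# Hypothetical sequence: LPKTG at residues 14-18, RGD at residues 20-22.
example_hits = find_motifs("MKK" + "A" * 10 + "LPKTGE" + "RGD")
```

Positions are reported 1-based and inclusive so that a five-residue LPXTG hit reads like the "LPXTG motif: 159-163" entries in the examples.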

Enzymes belonging to the glycolytic pathway were also selected as antigens, because these have been
found experimentally expressed on the surface of Streptococci [e.g. Pancholi & Fischetti (1992) J Exp
Med 176:415-26; Pancholi & Fischetti (1998) J Biol Chem 273:14503-15].

Cloning, expression and purification of proteins 

GBS genes were cloned to facilitate expression in E.coli as two different types of fusion proteins:

a) proteins having a hexa-histidine tag at the amino-terminus (His-gbs)

b) proteins having a GST fusion partner at the amino-terminus (Gst-gbs)

Cloning was performed using the Gateway™ technology (Life Technologies), which is based on the
site-specific recombination reactions that mediate integration and excision of phage lambda into and
from the E.coli genome. A single cloning experiment included the following steps:

1- Amplification of GBS chromosomal DNA to obtain a PCR product coding for a single ORF
flanked by attB recombination sites.

2- Insertion of the PCR product into a pDONR vector (containing attP sites) through a BP reaction
(attB x attP sites). This reaction gives a so called 'pEntry' vector, which now contains attL sites
flanking the insert.

3- Insertion of the GBS gene into E.coli expression vectors (pDestination vectors, containing attR
sites) through a LR reaction between pEntry and pDestination plasmids (attL x attR sites).
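The bookkeeping of recombination sites across the BP and LR steps can be sketched as a small lookup. It relies on the general behaviour of the Gateway system: attB x attP gives attL around the gene, and attL x attR gives attB around the gene. The names `REACTIONS` and `recombine` are hypothetical.

```python
# site flanking the gene -> (reaction, partner site on the vector,
#                            site flanking the gene after recombination)
REACTIONS = {
    "attB": ("BP", "attP", "attL"),   # PCR product x pDONR -> pEntry
    "attL": ("LR", "attR", "attB"),   # pEntry x pDestination -> expression clone
}


def recombine(gene_flank: str, vector_site: str) -> tuple:
    """Return (reaction name, new flanking site) for a valid site pairing."""
    reaction, partner, product = REACTIONS.get(gene_flank, (None, None, None))
    if partner != vector_site:
        raise ValueError(f"{gene_flank} does not recombine with {vector_site}")
    return reaction, product


# Follow one gene through the two steps of the cloning scheme.
step2 = recombine("attB", "attP")
step3 = recombine(step2[1], "attR")
```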

A) Chromosomal DNA preparation

For chromosomal DNA preparation, GBS strain 2603 V/R (Istituto Superiore Sanità, Rome) was grown
to exponential phase in 2 litres TH Broth (Difco) at 37°C, harvested by centrifugation, and dissolved in
40 ml TES (50 mM Tris pH 8, 5 mM EDTA pH 8, 20% sucrose). After addition of 2.5 ml lysozyme
solution (25 mg/ml in TES) and 0.5 ml mutanolysin (Sigma M-9901, 25000 U/ml in H2O), the suspension
was incubated at 37°C for 1 hour. 1 ml RNase (20 mg/ml) and 0.1 ml proteinase K (20 mg/ml) were
added and incubation was continued for 30 min. at 37°C.

Cell lysis was obtained by adding 5 ml sarkosyl solution (10% N-laurylsarcosine in 250 mM EDTA pH
8.0), and incubating 1 hour at 37°C with frequent inversion. After sequential extraction with phenol,
phenol-chloroform and chloroform, DNA was precipitated with 0.3M sodium acetate pH 5.2 and 2
volumes of absolute ethanol. The DNA pellet was rinsed with 70% ethanol and dissolved in TE buffer
(10 mM Tris-HCl, 1 mM EDTA, pH 8). DNA concentration was evaluated by OD260.

B) Oligonucleotide design

Synthetic oligonucleotide primers were designed on the basis of the coding sequence of each ORF. The
aim was to express the protein's extracellular region. Accordingly, predicted signal peptides were
omitted (by deducing the 5' end amplification primer sequence immediately downstream from the
predicted leader sequence) and C-terminal cell-wall anchoring regions were removed (e.g. LPXTG motifs
and downstream amino acids). Where additional nucleotides have been deleted, this is indicated by the
suffix 'd' (e.g. 'GBS352d' - see Table V). Conversely, a suffix 'L' refers to expression without these
deletions. Deletions of C- or N-terminal residues were also sometimes made, as indicated by a 'C' or 'N'
suffix.

The amino acid sequences of the expressed GBS proteins (including 'd' and 'L' forms etc.) are
definitively defined by the sequences of the oligonucleotide primers given in Table II.

5' tails of forward primers and 3' tails of reverse primers included attB1 and attB2 sites respectively:

Forward primers: 5'-GGGGACAAGTTTGTACAAAAAAGCAGGCTCT-ORF in frame-3' (the TCT
sequence preceding the ORF was omitted when the ORF's first coding triplet began with T).

Reverse primers: 5'-GGGGACCACTTTGTACAAGAAAGCTGGGTT-ORF reverse complement-3'.
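The primer layout can be illustrated with a short sketch. The tails are read from the text as GGGGACAAGTTTGTACAAAAAAGCAGGC plus an optional TCT (forward) and GGGGACCACTTTGTACAAGAAAGCTGGGTT (reverse). The function names, the example region and the fixed hybridizing length are hypothetical; in the procedure the hybridizing length was chosen from the melting temperature.

```python
ATTB1_CORE = "GGGGACAAGTTTGTACAAAAAAGCAGGC"    # forward tail without the TCT spacer
ATTB2_TAIL = "GGGGACCACTTTGTACAAGAAAGCTGGGTT"  # reverse tail
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gateway_primers(region: str, n_hybridizing: int) -> tuple:
    """Build (forward, reverse) primers for the region to be expressed.

    region is the coding sequence to amplify, already trimmed of any signal
    peptide or anchor; n_hybridizing is an assumed, fixed hybridizing length.
    """
    region = region.upper()
    # The TCT spacer is dropped when the first coding triplet begins with T.
    spacer = "" if region.startswith("T") else "TCT"
    forward = ATTB1_CORE + spacer + region[:n_hybridizing]
    reverse = ATTB2_TAIL + reverse_complement(region[-n_hybridizing:])
    return forward, reverse


# Hypothetical 18-nt region, used only to show how the primers are assembled.
fwd, rev = gateway_primers("ATGAAACCCGGGTTTTAA", 6)
```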

The number of nucleotides which hybridized to the sequence to be amplified depended on the melting
temperature of the primers, which was determined as described by Breslauer et al. [PNAS USA (1986)
83:3746-50]. The average melting temperature of the selected oligos was 50-55°C for the hybridizing
region and 80-85°C for the whole oligo.

C) Amplification

The standard PCR protocol was as follows: 50 ng genomic DNA were used as template in the presence
of 0.5 µM each primer, 200 µM each dNTP, 1.5 mM MgCl2, 1x buffer minus Mg2+ (Gibco-BRL) and 2
units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of 100 µl. Each sample
underwent a double-step of amplification: 5 cycles performed using as the hybridizing temperature 50°C,
followed by 25 cycles at 68°C.

The standard cycles were as follows:

Denaturation: 94°C, 2 min

5 cycles: Denaturation: 94°C, 30 seconds
Hybridization: 50°C, 50 seconds
Elongation: 72°C, 1 min. or 2 min. and 40 sec.

25 cycles: Denaturation: 94°C, 30 seconds
Hybridization: 68°C, 50 seconds
Elongation: 72°C, 1 min. or 2 min. and 40 sec.

Elongation time was 1 minute for ORFs shorter than 2000bp and 2 minutes 40 seconds for ORFs longer
than 2000bp. Amplifications were performed using a Gene Amp PCR system 9600 (Perkin Elmer).
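The two-stage cycling scheme can be written down as a short sketch that expands the programme into a list of (temperature in °C, duration in seconds) steps. The function names are hypothetical. An ORF of exactly 2000 bp is treated as long, which is an assumption because the text only distinguishes shorter and longer ORFs.

```python
def elongation_seconds(orf_length_bp: int) -> int:
    """1 min below 2000 bp, otherwise 2 min 40 s (exactly 2000 bp: assumed long)."""
    return 60 if orf_length_bp < 2000 else 160


def pcr_program(orf_length_bp: int) -> list:
    """Expand the standard protocol into (temperature_C, seconds) steps."""
    extension = elongation_seconds(orf_length_bp)

    def cycles(hybridization_c: int, count: int) -> list:
        one_cycle = [(94, 30), (hybridization_c, 50), (72, extension)]
        return one_cycle * count

    # Initial denaturation, then 5 cycles at 50°C and 25 cycles at 68°C.
    return [(94, 120)] + cycles(50, 5) + cycles(68, 25)


def total_seconds(program: list) -> int:
    """Sum of hold times, ignoring ramping between temperatures."""
    return sum(seconds for _, seconds in program)
```

For a 1500 bp ORF, `total_seconds(pcr_program(1500))` gives 4320 seconds of hold time, i.e. 72 minutes before ramping is taken into account.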

To check amplification results, an aliquot of each PCR product was loaded onto a 1-1.5% agarose gel
and the size of amplified fragments was compared with DNA molecular weight standards (DNA marker IX
Roche, 1kb DNA ladder Biolabs).

Single band PCR products were purified by PEG precipitation: 300 µl of TE buffer and 200 µl of 30%
PEG 8000/30 mM MgCl2 were added to 100 µl PCR reaction. After vortexing, the DNA was centrifuged
for 20 min at 10000g, washed with 1 vol. 70% ethanol and the pellet dissolved in 30 µl TE. PCR
products smaller than 350 bp were purified using a PCR purification Kit (Qiagen) and eluted with 30 µl
of the provided elution buffer.
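The final concentrations in the precipitation mix follow from the volumes given. The sketch below is plain dilution arithmetic (C1·V1 = C2·V2) and assumes the volumes are additive; the variable and function names are hypothetical.

```python
def final_concentration(stock: float, stock_volume: float, total_volume: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock * stock_volume / total_volume


# Volumes in µl, taken from the precipitation step.
te_ul, peg_mix_ul, pcr_ul = 300.0, 200.0, 100.0
total_ul = te_ul + peg_mix_ul + pcr_ul

peg_percent = final_concentration(30.0, peg_mix_ul, total_ul)  # % PEG 8000
mgcl2_mM = final_concentration(30.0, peg_mix_ul, total_ul)     # mM MgCl2
```

Under these assumptions the DNA is precipitated from 600 µl at 10% PEG 8000 and 10 mM MgCl2.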

In order to evaluate the yield, 2 µl of the purified DNA were subjected to agarose gel electrophoresis and
compared to titrated molecular weight standards.

D) Cloning of PCR products into expression vectors 

Cloning was performed following the Gateway™ technology's "one-tube protocol", which consists of a
two step reaction (BP and LR) for direct insertion of PCR products into expression vectors.

BP reaction (attB x attP sites): The reaction allowed insertion of the PCR product into a pDONR
vector. The pDONR™ 201 vector we used contains the killer toxin gene ccdB between attP1 and attP2
sites to minimize background colonies lacking the PCR insert, and a selectable marker gene for
kanamycin resistance. The reaction resulted in a so called pEntry vector, in which the GBS gene was
located between attL1 and attL2 sites.

60 fmol of PCR product and 100 ng of pDONR™ 201 vector were incubated with 2.5 µl of BP
clonase™ in a final volume of 12.5 µl for 4 hours at 25°C.
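Because the amount of PCR product is given in fmol while yields are estimated by mass on a gel, a conversion is needed in practice. The sketch below uses the common approximation of about 660 g/mol per base pair of double-stranded DNA. That value and the function name are assumptions and are not stated in the text.

```python
AVERAGE_BP_MASS = 660.0  # g/mol per base pair of dsDNA (approximation)


def fmol_to_ng(fmol: float, length_bp: int) -> float:
    """Mass in ng of a given molar amount of double-stranded DNA."""
    grams = fmol * 1e-15 * length_bp * AVERAGE_BP_MASS
    return grams * 1e9


# 60 fmol of a hypothetical 1000 bp PCR product.
ng_needed = fmol_to_ng(60, 1000)
```

With this approximation, 60 fmol of a 1000 bp product corresponds to roughly 40 ng, and the required mass scales linearly with product length.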

LR reaction (attL x attR sites): The reaction allowed the insertion of the GBS gene, now present in the
pEntry vector, into E.coli expression vectors (pDestination vectors, containing attR sites). Two
pDestination vectors were used (pDEST15 for N-terminal GST fusions - Figure 86; and pDEST17-1
for N-terminal His-tagged fusions - Figure 87). Both allow transcription of the ORF fusion coding
mRNA under T7 RNA polymerase promoter [Studier et al (1990) Meth. Enzymol 185:60ff].

To 5 µl of BP reaction were added 0.25 µl of 0.75 M NaCl, 100 ng of destination vector and 1.5 µl of
LR clonase™. The reaction was incubated at 25°C for 2 hours and stopped with 1 µl of 1 mg/ml
proteinase K solution at 37°C for 15 min.

1 µl of the completed reaction was used to transform 50 µl electrocompetent BL21-SI™ cells (0.1 cm,
200 ohms, 25 µF). BL21-SI cells contain an integrated T7 RNA polymerase gene under the control of
the salt-inducible proU promoter [Gowrishankar (1985) J. Bacteriol 164:434ff]. After electroporation
cells were diluted in 1 ml SOC medium (20 g/l bacto-tryptone, 5 g/l yeast extract, 0.58 g/l NaCl, 0.186 g/l
KCl, 20 mM glucose, 10 mM MgCl2) and incubated at 37°C for 1 hour. 200 µl cells were plated onto
LBON plates (Luria Broth medium without NaCl) containing 100 µg/ml ampicillin. Plates were then
incubated for 16 hours at 37°C.
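The SOC recipe mixes mass and molar units. As a quick check, the salt components can be converted from g/l to mM. The molar masses below are standard values and are not given in the text; the names are hypothetical.

```python
MOLAR_MASS_G_PER_MOL = {"NaCl": 58.44, "KCl": 74.55}  # standard values


def grams_per_litre_to_mM(grams_per_litre: float, compound: str) -> float:
    """Convert a mass concentration (g/l) to a molar concentration (mM)."""
    return grams_per_litre / MOLAR_MASS_G_PER_MOL[compound] * 1000.0


nacl_mM = grams_per_litre_to_mM(0.58, "NaCl")
kcl_mM = grams_per_litre_to_mM(0.186, "KCl")
```

The values come out close to 10 mM NaCl and 2.5 mM KCl. This low salt content matters for BL21-SI cells, whose T7 RNA polymerase is under a salt-inducible promoter.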

Entry clones: In order to allow the future preparation of Gateway compatible pEntry plasmids
containing genes which might turn out of interest after immunological assays, 2.5 µl of BP reaction were
incubated for 15 min in the presence of 3 µl 0.15 mg/ml proteinase K solution and then kept at -20°C.
The reaction was in this way available to transform E.coli competent cells so as to produce Entry clones
for future introduction of the genes in other Destination vectors.

E) Protein expression 

Single colonies derived from the transformation of LR reactions were inoculated as small-scale cultures
in 3 ml LBON 100 µg/ml ampicillin for overnight growth at 25°C. 50-200 µl of the culture was inoculated
in 3 ml LBON/Amp to an initial OD600 of 0.1. The cultures were grown at 37°C until OD600 0.4-0.6
and recombinant protein expression was induced by adding NaCl to a final concentration of 0.3 M. After
2 hour incubation the final OD was checked and the cultures were cooled on ice. 0.5 OD600 of cells were
harvested by centrifugation. The cell pellet was suspended in 50 µl of protein Loading Sample Buffer (50
mM TRIS-HCl pH 6.8, 0.5% w/v SDS, 2.5% v/v glycerin, 0.05% w/v Bromophenol Blue, 100 mM
DTT) and incubated at 100°C for 5 min. 10 µl of sample was analyzed by SDS-PAGE and Coomassie
Blue staining to verify the presence of induced protein band.
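The volumes implied by this procedure can be estimated with simple arithmetic. The sketch below makes three assumptions: the 3 ml is the final culture volume, "0.5 OD600 of cells" means the volume containing 0.5 OD units, and induction uses a 5 M NaCl stock, which the text does not specify. All function names are hypothetical.

```python
def inoculum_volume_ml(overnight_od: float, target_od: float = 0.1,
                       final_volume_ml: float = 3.0) -> float:
    """Volume of overnight culture giving the target starting OD600."""
    return target_od * final_volume_ml / overnight_od


def harvest_volume_ml(final_od: float, od_units: float = 0.5) -> float:
    """Culture volume that contains the requested number of OD600 units."""
    return od_units / final_od


def nacl_stock_volume_ml(culture_ml: float, target_molar: float = 0.3,
                         stock_molar: float = 5.0) -> float:
    """Stock volume v such that stock * v / (culture + v) equals the target.

    Assumes the medium itself contributes no NaCl (LBON) and a 5 M stock.
    """
    return target_molar * culture_ml / (stock_molar - target_molar)
```

For example, an overnight culture at OD600 3.0 needs 0.1 ml of inoculum, which falls within the 50-200 µl range quoted above.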

F) Purification of the recombinant proteins 

Single colonies were inoculated in 25 ml LBON 100 µg/ml ampicillin and grown at 25°C overnight. The
overnight culture was inoculated in 500 ml LBON/amp and grown under shaking at 25°C until OD600
values of 0.4-0.6. Protein expression was then induced by adding NaCl to a final concentration of 0.3 M.
After 3 hours incubation at 25°C the final OD600 was checked and the cultures were cooled on ice. After
centrifugation at 6000 rpm (JA10 rotor, Beckman) for 20 min., the cell pellet was processed for
purification or frozen at -20°C.

Proteins were purified in 1 of 3 ways depending on the fusion partner and the protein's solubility:

Purification of soluble His-tagged proteins from E.coli

1. Transfer pellets from -20°C to ice bath and reconstitute each pellet with 10 ml B-PER™ solution
(Bacterial-Protein Extraction Reagent, Pierce cat. 78266), 10 µl of a 100 mM MgCl2 solution, 50 µl
of DNAse I (Sigma D-4263, 100 Kunits in PBS) and 100 µl of 100 mg/ml lysozyme in PBS
(Sigma L-7651, final concentration 1 mg/ml).

2. Transfer resuspended pellets in 50 ml centrifuge tubes and leave at room temperature for 30-40
minutes, vortexing 3-4 times.

3. Centrifuge 15-20 minutes at about 30-40000 x g.

4. Prepare Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow Ni-activated Chelating
Sepharose (Pharmacia). Equilibrate with 50 mM phosphate buffer, 300 mM NaCl, pH 8.0.

5. Store the pellet at -20°C, and load the supernatant on to the columns.

6. Discard the flow through.

7. Wash with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0.

8. Elute the proteins bound to the columns with 4.5 ml (1.5 ml + 1.5 ml + 1.5 ml) 250 mM imidazole
buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0 and collect three fractions of ~1.5 ml each. Add
to each tube 15 µl DTT 200 mM (final concentration 2 mM).

9. Measure the protein concentration of the collected fractions with the Bradford method and analyse
the proteins by SDS-PAGE.

10. Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

11. For immunisation prepare 4-5 aliquots of 20-100 µg each in 0.5 ml in 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until immunisation.
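The aliquots for immunisation can be planned with a short calculation. The sketch below assumes the protein concentration is known from the Bradford measurement and that glycerol is added from a 100% stock, which the text does not state. The function name and the default dose of 50 µg are hypothetical.

```python
def aliquot_recipe(protein_mg_per_ml: float, protein_ug: float = 50.0,
                   total_ul: float = 500.0, glycerol_final: float = 0.40,
                   glycerol_stock: float = 1.00) -> dict:
    """Volumes (µl) of protein, glycerol and dilution buffer for one aliquot."""
    # 1 mg/ml equals 1 µg/µl, so µg divided by mg/ml gives µl directly.
    protein_ul = protein_ug / protein_mg_per_ml
    glycerol_ul = total_ul * glycerol_final / glycerol_stock
    buffer_ul = total_ul - protein_ul - glycerol_ul
    if buffer_ul < 0:
        raise ValueError("protein too dilute for the requested dose and volume")
    return {"protein_ul": protein_ul, "glycerol_ul": glycerol_ul,
            "buffer_ul": buffer_ul}


recipe = aliquot_recipe(1.0)  # hypothetical fraction at 1 mg/ml
```

The `ValueError` branch flags fractions that are too dilute to deliver the requested dose in 0.5 ml at 40% glycerol.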

Purification of His-tagged proteins from inclusion bodies

1. Bacteria are collected from 500 ml cultures by centrifugation. If required store bacterial pellets at
-20°C. Transfer the pellets from -20°C to room temperature and reconstitute each pellet with 10
ml B-PER™ solution, 10 µl of a 100 mM MgCl2 solution (final 1 mM), 50 µl of DNAse I
equivalent to 100 Kunits in PBS and 100 µl of a 100 mg/ml lysozyme (Sigma L-7651) solution
in PBS (equivalent to 10 mg, final concentration 1 mg/ml).

2. Transfer the resuspended pellets in 50 ml centrifuge tubes and leave at room temperature for 30-40
minutes, vortexing 3-4 times.

3. Centrifuge 15 minutes at 30-40000 x g and collect the pellets.

4. Dissolve the pellets with 50 mM TRIS-HCl, 1 mM TCEP {Tris(2-carboxyethyl)-phosphine
hydrochloride, Pierce}, 6M guanidine hydrochloride, pH 8.5. Stir for ~10 min. with a magnetic
bar.

5. Centrifuge as described above, and collect the supernatant.

6. Prepare Poly-Prep (Bio-Rad) columns containing 1 ml of Fast Flow Ni-activated Chelating
Sepharose (Pharmacia). Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM
TRIS-HCl, 1 mM TCEP, 6M guanidine hydrochloride, pH 8.5.

7. Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8.5.

8. Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP,
pH 8.5. Collect and set aside the first 5 ml for possible further controls.

9. Elute proteins bound to columns with 4.5 ml buffer containing 250 mM imidazole, 50 mM TRIS-HCl,
6M urea, 1 mM TCEP, pH 8.5. Add the elution buffer in three 1.5 ml aliquots, and collect
the corresponding three fractions. Add to each fraction 15 µl DTT (final concentration 2 mM).

10. Measure eluted protein concentration with Bradford method and analyse proteins by SDS-PAGE.

11. Dialyse overnight the selected fraction against 50 mM Na phosphate buffer, pH 8.8, containing
10% glycerol, 0.5 M arginine, 5 mM reduced glutathione, 0.5 mM oxidized glutathione, 2 M urea.

12. Dialyse against 50 mM Na phosphate buffer, pH 8.8, containing 10% glycerol, 0.5 M arginine, 5
mM reduced glutathione, 0.5 mM oxidized glutathione.

13. Clarify the dialysed protein preparation by centrifugation, discard the non-soluble material and
measure the protein concentration with the Bradford method.

14. For each protein destined for immunization prepare 4-5 aliquots of 20-100 µg each in 0.5 ml
after having adjusted the glycerol content up to 40%. Store the prepared aliquots at -20°C until
immunization.

Purification of GST-fusion proteins from E.coli

1. Bacteria are collected from 500 ml cultures by centrifugation. If required store bacterial pellets at
-20°C. Transfer the pellets from -20°C to room temperature and reconstitute each pellet with 10
ml B-PER™ solution, 10 µl of a 100 mM MgCl2 solution (final 1 mM), 50 µl of DNAse I
equivalent to 100 Kunits in PBS and 100 µl of a 100 mg/ml lysozyme (Sigma L-7651) solution
in PBS (equivalent to 10 mg, final concentration 1 mg/ml).

2. Transfer the resuspended pellets in 50 ml centrifuge tubes and leave at room temperature for 30-40
minutes, vortexing 3-4 times.

3. Centrifuge 15-20 minutes at about 30-40000 x g.

4. Discard centrifugation pellets and load supernatants onto the chromatography columns, as
follows.

5. Prepare Poly-Prep (Bio-Rad) columns containing 0.5 ml of Glutathione-Sepharose 4B resin. Wash
the columns twice with 1 ml of H2O and equilibrate with 10 ml PBS, pH 7.4.

6. Load supernatants on to the columns and discard the flow through.

7. Wash the columns with 10 ml PBS, pH 7.4.

8. Elute proteins bound to columns with 4.5 ml of 50 mM TRIS buffer, 10 mM reduced glutathione,
pH 8.0, adding 1.5 ml + 1.5 ml + 1.5 ml and collecting the respective 3 fractions of ~1.5 ml each.

9. Measure protein concentration of the fractions with the Bradford method and analyse the proteins
by SDS-PAGE.

10. Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

11. For each protein destined for immunisation prepare 4-5 aliquots of 20-100 µg each in 0.5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8.0. Store the aliquots at
-20°C until immunisation.

Figures 167 to 170 and 238 to 239 

For the experiments shown in Figures 167 to 170, Figure 238 and lanes 2-6 of Figure 239, the GBS
proteins were fused at the N-terminus to thioredoxin and at C-terminus to a poly-His tail. The plasmid
used for cloning is pBAD-DEST49 (Invitrogen Gateway™ technology) and expression is under the
control of an L(+)-Arabinose dependent promoter. For the production of these GBS antigens, bacteria
are grown on RM medium (6 g/l Na2HPO4, 3 g/l KH2PO4, 0.5 g/l NaCl, 1 g/l NH4Cl, pH 7.4, 2%
casaminoacids, 0.2% glucose, 1 mM MgCl2) containing 100 µg/ml ampicillin. After incubation at 37°C
until cells reach OD600=0.5, protein expression is induced by adding 0.2% (v/v) L(+)Arabinose for 3
hours.

Immunisations with GBS proteins 

The purified proteins were used to immunise groups of four CD-1 mice intraperitoneally. 20 µg of each
purified protein was injected in Freund's adjuvant at days 1, 21 & 35. Immune responses were
monitored by using samples taken on day 0 & 49. Sera were analysed as pools of sera from each group
of mice.

FACScan bacteria Binding Assay procedure. 

GBS serotype V 2603 V/R strain was plated on TSA blood agar plates and incubated overnight at 37°C.
Bacterial colonies were collected from the plates using a sterile dacron swab and inoculated into 100 ml
Todd Hewitt Broth. Bacterial growth was monitored every 30 minutes by following OD600. Bacteria were
grown until OD600 = 0.7-0.8. The culture was centrifuged for 20 minutes at 5000rpm. The supernatant
was discarded and bacteria were washed once with PBS, resuspended in ½ culture volume of PBS
containing 0.05% paraformaldehyde, and incubated for 1 hour at 37°C and then overnight at 4°C.

50 µl bacterial cells (OD600 0.1) were washed once with PBS and resuspended in 20 µl blocking serum
(Newborn Calf Serum, Sigma) and incubated for 20 minutes at room temperature. The cells were then
incubated with 100 µl diluted sera (1:200) in dilution buffer (20% Newborn Calf Serum 0.1% BSA in
PBS) for 1 hour at 4°C. Cells were centrifuged at 5000rpm, the supernatant aspirated and cells washed
by adding 200 µl washing buffer (0.1% BSA in PBS). 50 µl R-Phycoerythrin conjugated F(ab)2 goat
anti-mouse, diluted 1:100 in dilution buffer, was added to each sample and incubated for 1 hour at 4°C.
Cells were spun down by centrifugation at 5000rpm and washed by adding 200 µl of washing buffer. The
supernatant was aspirated and cells resuspended in 200 µl PBS. Samples were transferred to FACScan
tubes and read. The conditions for FACScan setting were: FL2 on; FSC-H threshold: 54; FSC PMT
Voltage: E 02; SSC PMT: 516; Amp. Gains 2.63; FL-2 PMT: 728. Compensation values: 0.

Samples were considered as positive if they had a Δ mean value > 50 channel values.
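The positivity rule can be expressed as a one-line check. The sketch assumes that Δ mean is the mean fluorescence channel obtained with immune serum minus that obtained with preimmune serum; the text does not spell this out. The function name is hypothetical.

```python
def is_facs_positive(immune_mean: float, preimmune_mean: float,
                     threshold: float = 50.0) -> bool:
    """True when the shift in mean fluorescence channel exceeds the threshold.

    The comparison is strict, matching "> 50 channel values".
    """
    return (immune_mean - preimmune_mean) > threshold
```

With these assumptions a shift of exactly 50 channels is not scored as positive.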
Whole Extracts preparation

GBS serotype III COH1 strain and serotype V 2603 V/R strain cells were grown overnight in Todd
Hewitt Broth. 1 ml of the culture was inoculated into 100 ml Todd Hewitt Broth. Bacterial growth was
monitored every 30 minutes by following OD600. The bacteria were grown until the OD reached 0.7-0.8.
The culture was centrifuged for 20 minutes at 5000 rpm. The supernatant was discarded and bacteria
were washed once with PBS, resuspended in 2 ml 50 mM Tris-HCl, pH 6.8 adding 400 units of
Mutanolysin (Sigma-Aldrich) and incubated 3 hrs at 37°C. After 3 cycles of freeze/thaw, cellular debris
were removed by centrifugation at 14000g for 15 minutes and the protein concentration of the
supernatant was measured by the Bio-Rad Protein assay, using BSA as a standard.

Western blotting 

Purified proteins (50 ng) and total cell extracts (25 µg) derived from GBS serotype III COH1 strain and
serotype V 2603 V/R strain were loaded on 12% or 15% SDS-PAGE and transferred to a nitrocellulose
membrane. The transfer was performed for 1 hour at 100V at 4°C, in transferring buffer (25 mM Tris
base, 192 mM glycine, 20% methanol). The membrane was saturated by overnight incubation at 4°C in
saturation buffer (5% skimmed milk, 0.1% Tween 20 in PBS). The membrane was incubated for 1 hour
at room temperature with 1:1000 mouse sera diluted in saturation buffer. The membrane was washed
twice with washing buffer (3% skimmed milk, 0.1% Tween 20 in PBS) and incubated for 1 hour with a
1:5000 dilution of horseradish peroxidase labelled anti-mouse Ig (Bio-Rad). The membrane was washed
twice with 0.1% Tween 20 in PBS and developed with the Opti-4CN Substrate Kit (Bio-Rad). The
reaction was stopped by adding water.

Unless otherwise indicated, lanes 1, 2 and 3 of blots in the drawings are: (1) the purified protein; (2)
GBS-III extracts; and (3) GBS-V extracts. Molecular weight markers are also shown.

In vivo passive protection assay in neonatal sepsis mouse model

The immune sera collected from the CD1 immunized mice were tested in a mouse neonatal sepsis model
to verify their protective efficacy in mice challenged with GBS serotype III. Newborn Balb/C littermates
were randomly divided in two groups within 24 hrs from birth and injected subcutaneously with 25 µl of
diluted sera (1:15) from immunized CD1 adult mice. One group received preimmune sera, the other
received immune sera. Four hours later all pups were challenged with a 75% lethal dose of the GBS
serotype III COH1 strain. The challenge dose obtained diluting a mid log phase culture was administered
subcutaneously in 25 µl of saline. The number of pups surviving GBS infection was assessed every 12
hours for 4 days. Results are in Table III.

Example 1

A DNA sequence (GBSx1402) was identified in S.agalactiae <SEQ ID 1> which encodes the amino acid
sequence <SEQ ID 2>. Analysis of this protein sequence reveals the following:

Possible site: 27
>>> Seems to have an uncleavable N-term signal seq
INTEGRAL Likelihood = -0.48 Transmembrane 169 - 185 ( 169 - 185)

----- Final Results -----
bacterial membrane --- Certainty=0.1192 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB88235 GB:AL353012 hypothetical serine-rich repeat protein
[Schizosaccharomyces pombe]
Identities = 41/152 (26%), Positives = 75/152 (48%), Gaps = 4/152 (2%)

Query: 22 SSIGYADTSDKNTDTSVVTTTLSEEKRSDEI£)QSSTGSSSE^ 81 

SS +++S +++D+S ++ E S+ D SS+ SSSE+ESSS ++ S++ + 

Sbjct: 132 SSDSESESSSEDSDSSSSSSDSESESSSEGSDSSSSSSSSESESSSEDNDSSSSSSDSES 191 

Query: 82 TEPSQPSPSEENKPDGRTKTE---IGNNKDISSGTK^ISEDSIK^ 138 

S+ S S + D +++ ++ SS SED+ + S + S+ E D 

Sbjct: 192 ESSSEDSDSSSSSSDSESESSSEGSDSSSSSSSSESESSSEDNDSSSSSSDSESESSSED 251 

Query: 139 ESSSSKANDGK-KGHSKPKKELPKTGDSHSDT 169 

SSS ++D + + SK + DS D+ 

Sbjct: 252 SDSSSSSSDSBSESSSKDSDSSSNSSDSEDDS 283 

There is also homology to SEQ ID 1984. 

A related GBS gene <SEQ ID 8785> and protein <SEQ ID 8786> were also identified. Analysis of this
protein sequence reveals the following:

Lipop: Possible site: -1 Crend: 5
McG: Discrim Score: 6.72
GvH: Signal Score (-7.5): -4.34
Possible site: 27
>>> Seems to have an uncleavable N-term signal seq
ALOM program count: 1 value: -0.48 threshold: 0.0
INTEGRAL Likelihood = -0.48 Transmembrane 169 - 185 ( 169 - 185)
PERIPHERAL Likelihood = 0.16 7
modified ALOM score: 0.60

*** Reasoning Step: 3

----- Final Results -----
bacterial membrane --- Certainty=0.1192 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

LPXTG motif: 159-163 

SEQ ID 2 (GBS4) was expressed in E.coli as a GST-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 9 (lane 3; MW 43.1kDa) and Figure 63 (lane 4; MW 50kDa). It was also
expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell extract is shown in Figure 12
(lane 7; MW 30kDa), Figure 63 (lane 3; MW 30kDa) and in Figure 178 (lane 3; MW 30kDa).

GBS4-GST was purified as shown in Figure 190 (lane 6) and Figure 209 (lane 8).

Purified GBS4-His is shown in Figures 89A, 191 (lane 10), 209 (lane 7) and 228 (lanes 9 & 10).
The purified GBS4-His fusion product was used to immunise mice (lane 2 product; 20 µg/mouse). The
resulting antiserum was used for Western blot (Figure 89B), FACS, and in the in vivo passive protection
assay (Table III). These tests confirm that the protein is immunoaccessible on GBS bacteria and that it is an
effective protective immunogen.

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2 

A DNA sequence (GBSx1100) was identified in S.agalactiae <SEQ ID 3> which encodes the amino acid
sequence <SEQ ID 4>. This protein is predicted to be aggregation promoting protein. Analysis of this
protein sequence reveals the following:

Possible site: 33
>>> Seems to have a cleavable N-term signal seq.

----- Final Results -----
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAA69725 GB:Y08498 aggregation promoting protein [Lactobacillus gasseri]
Identities = 56/103 (54%), Positives = 69/103 (66%), Gaps = 5/103 (4%)

Query 82 TASQAEAKSQPT lENSMNSSSNI£SSDSAAKEElJWa?ESl«3STO^^ 136 

25 TSAA+QT + + + +NS S++AAK +A RES G Y+A NGQY 6+YQ 

Sbjct: 195 TYSlZASAQKarTQVAQKTO^ 254 

Query: 137 LSQSYI^KSDI^PENQBKVAimWSRYGSWSAALSFWNSNGWY 179 
LS SYL 6D S NQE+VRDNYV SRY6SW+ A FW +NGWY 
30 Sbjct: 255 I^ASYLQGDySftANQERVADNYVKSRYGSWTGftQK^^ 297 

No corresponding DNA sequence was identified in S.pyogenes.

A related GBS gene <SEQ ID 8709> and protein <SEQ ID 8710> were also identified. Analysis of this
protein sequence reveals the following:

Lipop: Possible site: -1 Crend: 9
McG: Discrim Score: 2.59
GvH: Signal Score (-7.5): -0.42
Possible site: 33
>>> Seems to have a cleavable N-term signal seq.
ALOM program count: 0 value: 6.79 threshold: 0.0
PERIPHERAL Likelihood = 6.79 59
modified ALOM score: -1.86

*** Reasoning Step: 3

----- Final Results -----
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases:

57.5/71.3% over 92aa
Lactobacillus gasseri
EGAD|154417| aggregation promoting protein Insert characterized
GP|1619598|emb|CAA69725.1||Y08498 aggregation promoting protein Insert characterized

ORF01056(547 - 837 of 1137)
EGAD|154417|164788(205 - 297 of 297) aggregation promoting protein {Lactobacillus
gasseri}GP|1619598|emb|CAA69725.1||Y08498 aggregation promoting protein {Lactobacillus gasseri}
%Match = 14.6
%Identity = 57.4 %Similarity = 71.3
Matches = 54 Mismatches = 26 Conservative Sub.s = 13

507 537 567 597 627 657 687 717 

SIJffSISNMmSI(am.KIJ)NSTASQAEAKSQPTim^ 

NTORTY 

200 210 220 230 240 250 

747 777 807 837 867 897 927 957 

SYtM3DI£PENQKKVftDNYVVSRYGSWSaAI£FVWSl^ 

III II i |||:|illlt Illlll^ I II ^llll 

270 280 290 

A related GBS gene <SEQ ID 8711> and protein <SEQ ID 8712> were also identified. Analysis of this
protein sequence reveals the following:

Lipop: Possible site: -1 Crend: 9
McG: Discrim Score: 2.59
GvH: Signal Score (-7.5): -0.42
Possible site: 33
>>> Seems to have a cleavable N-term signal seq.
ALOM program count: 0 value: 6.79 threshold: 0.0
PERIPHERAL Likelihood = 6.79 59
modified ALOM score: -1.86

*** Reasoning Step: 3

----- Final Results -----
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases:

44.0/62.0% over 115aa Bacillus subtilis
EGAD|108478| hypothetical protein Insert characterized
OMNI|NT01BS1100 p60-related protein Insert characterized
GP|2226145|emb|CAA74437.1||Y14079 hypothetical protein Insert characterized
GP|2633272|emb|CAB12776.1||Z99109 similar to cell wall-binding protein Insert characterized
PIR|B69825|B69825 cell wall-binding protein homolog yhdD - Insert characterized

31^78 18^0936(5? -^172 of 488) hypothetical protein {Bacillus subtilis}OMNI |NT01BS1100 
p60-related protein GP|2226145|emb|CAA74437.1||Y14079 hypothetical protein {Bacillus
subtilis}GP|2633272|emb|CAB12776.1||Z99109 similar to cell wall-binding protein {Bacillus
subtilis}PIR|B69825|B69825 cell wall-binding protein homolog yhdD - Bacillus subtilis
%Match = 9.0

%Identity = 44.0 %Similarity = 62.0

Matches = 44 Mismatches = 35 Conservative Sub.s = 18 

120 150 180 210 240 270 300 330 

*DQFMVLZ^SFI*CEKLNNFT*RKIJCIV^ 

MKKKLAAGLTASAIVGTTLVVTPAEAATIKV^^ 

10 20 30 40 50

360 390 435 465 495 525

PFKIiGVaSIiLVGASIALPIiSVSaAS YTVKSGDTIiSAIAKlOTKTTVQELVSLNS 

I 1 :1 :| :1 1 I: I IIIIIlM M I Hill li H ^hl I 

MSHLSTTVLSIGCmiTIPGSKSSTSSSTSSSTTMKSGSSVYTV^^ 

70 80 90 100 110 120 130 

543 573 603 633 663 693 723 753 

XmD NSTASQAE2^QPTIENSmSSSNLSSSDSAAKEEIJ^S*IKXWIiaRMDN^ 

II: :|::| :: I : :| 1111 111 I- : : b : : : 

LKVSGTVSSSSSSSKKSNSNKSSSSSSKSSSNKSSSSSSSTGTYKVQLGDSLWKIAl^^ 

150 160 170 180 190 200 210 

SEQ ID 8712 (GBS166) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 30 (lane 2; MW 13.1kDa).

The GBS166-His fusion product was purified (Figure 200, lane 10) and used to immunise mice. The
resulting antiserum was used for FACS (Figure 315), which confirmed that the protein is immunoaccessible
on GBS bacteria.

SEQ ID 4 (GBS15) was expressed in E.coli as a GST-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 9 (lane 5; MW 44.8kDa), Figure 63 (lane 5; MW 44.8kDa) and Figure 66 (lane 7;
MW 45kDa). It was also expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 10 (lane 4; MW 223kDa). It was also expressed as GBS15L; SDS-PAGE
analysis of total cell extract is shown in Figure 185 (lane 1; MW 50kDa).

Purified GBS15-GST is shown in Figure 91A, Figure 190 (lane 9), Figure 210 (lane 4) and Figure 245 
(lanes 4 & 5). 

The purified GBS15-GST fusion product was used to immunise mice (lane 1 + 2 products; 20µg/mouse).
The resulting antiserum was used for Western blot (Figure 91B), FACS (Figure 91C), and in the in vivo
passive protection assay (Table III). These tests confirm that the protein is immunoaccessible on GBS
bacteria and that it is an effective protective immunogen.

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 3 

A DNA sequence (GBSx0091) was identified in S.agalactiae <SEQ ID 303> which encodes the amino acid 
sequence <SEQ ID 304>. Analysis of this protein sequence reveals the following:

Possible site: 32

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -9.66 Transmembrane 22 - 38 ( 15 - 41)

----- Final Results -----

bacterial membrane --- Certainty=0.4864 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA72096 GB:Y11213 hypothetical protein [Streptococcus thermophilus]
Identities = 149/274 (54%), Positives = 208/274 (75%), Gaps = 9/274 (3%)

Query: 23 ELVBIiIiIJ5PGIFSI.IIPKSMP--KLTiaa)FLT!^ 80 
F + LL GI IIP S+ K++ K KK + YVA+GDSLT+GVGD+++QGGF
Sbjct: 5 FFIiMFVGILIPIIPSSHQSSKISDKIRSVKKE-KVT^ 63 

Query- 81 VPIiSESIJINRYSyQVTSVNyGVSCOTSCX3ILKF^^ 1^0 

VP+LS++I1 + +++QVT NYG++(3NTS QILKRM I++DL+KA L+TLTVGGNDV 
Sbjct: 64 VPVrfiQAI^FNWCfVTPPWTOI^^ 

Query- 141 lAVIRKELSHLSIiJSPEKPAERyKERITO 200 

+ VI+ +++Ir».+N+F K A Y++RL++I+ AR++N LPIY++G™PFYIOTP++T 
Sbjct: 124 lH\nKDNITNLNVNTPSKAAVDYQK^ ^83 

Query: 201 KMCyraDNWNKATKEVVDASENVYFVP^ 254 

+MQr++DNWN++T+EV +NVYFVP+ND LYKGINGK G+T S + S N 

Sbjct: 184 EMQTIVDNWNRSTEEVSKEYDNVYFVPVNDrj^^ 243 

Query: 255 DJU^FTGDHFHPNNIGYQIMSNAVMEKINETRKNW 288 

DALP DHFHPNN GYQIMS+A++++IN+T+K W 
Sbjct: 244 DALFEEDHFHPNKITGXQIMSimiJailNQTKKEW 277 

A related DNA sequence was identified in S.pyogenes <SEQ ID 305> which encodes the amino acid
sequence <SEQ ID 306>. Analysis of this protein sequence reveals the following:

Possible site: 39

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-12.05 Transmembrane 18 - 34 ( 10 - 37)

----- Final Results -----

bacterial membrane --- Certainty=0.5819 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related sequence was also identified in GAS <SEQ ID 9123> which encodes the amino acid sequence 
<SEQ ID 9124>. Analysis of this protein sequence reveals the following: 

Possible site: 33 
>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-12.05 Transmembrane 12 - 28

----- Final Results -----

bacterial membrane --- Certainty=0.5819 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below: 

Identities = 178/282 (63%) , Positives = 218/282 (77%) 

Query: 5 LLLWFVMNKKiaLTGLSFFLVSLIJ^FGIFSLIIPKSNPKLTKKDFLT^ 64 

L LWFVMN + + +G+ FF++SL L+F + ++IIPKSN +L K DFL K+ + + YVA+G 
atjjct: 1 iJlLWFVMNNRHLFSGlFFFVISLCLAPLLLNlIIPKSNSRLiaffiDFI^^ 60 

Query: 65 DSLTEGVGrOT-SQGGFVPLLSESLHNRYSyaVTSVNYGVSGNTSQQIIOTMTTDPQIBim 124 

DSLTEGVOD T QGGFVPLL+ L + V NYGVSG+TSQQIL RM QI+ 
Sbjct: 61 DSLTEGVOTLTHQGGFVraLTNDLSEYFKANVNHQNY^^ 120 

Query: 125 LEKADLLTLTVOGNimAVIRKELSHLSLNSFEKPAEAYKERLK^ 184 
L+KAD++TLTVGGNDV+AVIRK L+ L ++SP KPA Y++RL++I+ AR+DN LPI+ 

aojct: 121 lkkrdimtltvg^jdvmavi:rknladlq\^ 180 

Query: 185 VLGIYNPFYLNFPQLTKMC?mDNWNKATKEVVDASENVYFVP 244 

+L6IYNPFYLNFP+LT MQ VID+WN TKEW + VYFVPIND LYKGING+EGI 
Sbjct: 181 ILGIYOT»FYLNPPELTDMQKVIDDWNT^^ 240 

Query: 245 SSMSQftSITiroALFTGDHEHPNNIGXQIMSNAVME^^ 286 
SS Q +1 NDAIiFTGDHEHENN GYQIMSNAVMEKI + K 
Sbjct: 241 SSGDQTTrVNnALFTGDHFHPNNTGYQIM 282

A related GBS gene <SEQ ID 5> and protein <SEQ ID 6> were also identified. Analysis of this protein 
sequence reveals the following: 

Lipop: Possible site: -1 Crend: 4
SRCFLG: 0
McG: Length of UR: 24
Peak Value of UR: 3.02
Net Charge of CR: 3
McG: Discrim Score: 12.27
GvH: Signal Score (-7.5): -3.44

Possible site: 22
>>> Seems to have an uncleavable N-term signal seq
Amino Acid Composition: calculated from 1
ALOM program count: 1 value: -9.66 threshold: 0.0

INTEGRAL Likelihood = -9.66 Transmembrane 12 - 28 ( 5 - 31)
PERIPHERAL Likelihood = 1.96 118
modified ALOM score: 2.43
icm1 HYPID: 7 CFP: 0.486

*** Reasoning Step: 3

----- Final Results -----

bacterial membrane --- Certainty=0.4864 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the databases: 

56.0/80.3% over 272aa

GP|1850894| hypothetical protein Insert characterized

ORF02006(367 - 1164 of 1467)
GP|1850894|emb|CAA72096.1||Y11213(5 - 277 of 280) hypothetical protein {Streptococcus
thermophilus}
%Match = 30.8

%Identity = 56.0 %Similarity = 80.2

Matches = 150 Mismatches = 49 Conservative Sub.s = 65

141 171 201 231 261 291 321 351 

AV*RPSANG*IILLKVPKHBKLLKLASPTVVKLIWLITLEKN*^^ 

381 411 435 465 495 525 555 585 

TCLSPFLVSIJ:iLSFGIPSLIIPKSN--PKLTKKDFLTKKVIPLNYV3^^ 

:: 1:: :)! |1: \: | : : 1 11 11 h 1 111 h II I 1 - : 11 1 1 II = 1 I = 0 : :::| 

SPAGETOLFI^FNraiLIFIIPSSHQSSKISDKIRSVKK-EKVTYVAIGD^^ 

10 20 30 40 50 60 70 

615 645 675 705 735 765 795 825 

VTSVNYGVSGNTSCXJIIiKRMTTDPQIEKDLB^ 

11 11I::|||| mill |::|MI MllHlllh W ^^H^OH 1 I hHhU: |1 

TO>HNYQIA<yn?SNQILKRMQ^ 

90 100 110 120 130 140 150 

855 885 915 945 975 1005 1044 
ODNPKLPIYVXXSrXNPFYLNFPQLTKMCyrVIDNWNKATK^^ ESSNS 

::1 llll-lllllllllll-Mll-llll = :MI OIIIIMI liimil 10 : :| 

KK^^KTLPIYIIGIYNPFm^FPEM3™CffIVI^^ 

170 180 190 200 210 220 230 

1074 1104 1134 1164 1194 1224 1254 1284 

QASITNDALFTGDHFHENNIGYQIMSNAVMEKIN^^ 

1 |: iilll lllllll llllll:h:::||:M 1 
QDSL-Nn!U^EDHFHPNNTGYQIMSnAlLKRINQT 

250 260 270 280 

SEQ ID 6 (GBS103) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 36 (lane 4; MW 32kDa).

The GBS103-His fusion product was purified (Figure 107A; see also Figure 201, lane 9) and used to 
immunise mice (lane 2+3 product; 15.5µg/mouse). The resulting antiserum was used for Western blot
(Figure 107B), FACS (Figure 107C) and in the in vivo passive protection assay (Table III). These tests
confirm that the protein is immunoaccessible on GBS bacteria and that it is an effective protective
immunogen.

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 4

A DNA sequence (GBSx1316) was identified in S.agalactiae <SEQ ID 3837> which encodes the amino
acid sequence <SEQ ID 3838>. Analysis of this protein sequence reveals the following: 

Possible site: 23 

>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -4.30 Transmembrane 1058 - 1074 (1056 - 1075)

----- Final Results -----

bacterial membrane --- Certainty=0.2720 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes.

A related GBS gene <SEQ ID 7> and protein <SEQ ID 8> were also identified. Analysis of this protein 
sequence reveals the following:

Lipop: Possible site: -1 Crend: 10 
McG: Discrim Score: -13.26 
GvH: Signal Score (-7.5): -5.76 
Possible site: 41 
>>> Seems to have no N-terminal signal sequence

ALOM program count: 1 value: -4.30 threshold: 0.0

INTEGRAL Likelihood = -4.30 Transmembrane 489 - 505 ( 487 - 506)
PERIPHERAL Likelihood = 3.71 97 
modified ALOM score: 1.36 






*** Reasoning Step: 3 



----- Final Results -----

bacterial membrane --- Certainty=0.2720 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

LPXTG motif: 478-482

SEQ ID 8 (GBS195) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 24 (lane 8). It was also expressed in E.coli as a GST-fusion product. SDS-PAGE
analysis of total cell extract is shown in Figure 31 (lane 5).

GBS195C was expressed in E.coli as a GST-fusion product. SDS-PAGE analysis of total cell extract is
shown in Figure 175 (lane 6 & 7; MW 81kDa). 






GBS195L was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell extract is
shown in Figure 83 (lane 2; MW 123kDa).

GBS195LN was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell extract is
shown in Figure 83 (lane 3; MW 66kDa).

GBS195-GST was purified as shown in Figure 198, lane 5. GBS195-His was purified as shown in Figure 
222, lane 4-5. GBS195N-His was purified as shown in Figure 222, lane 6-7. 

The GBS195-GST fusion product was purified (Figure 87A) and used to immunise mice (lane 1 product;
13.6µg/mouse). The resulting antiserum was used for Western blot (Figure 87B), FACS, and in the in vivo
passive protection assay (Table III). These tests confirm that the protein is immunoaccessible on GBS
bacteria and that it is an effective protective immunogen.

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 
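
The ALOM lines in the reports above give a likelihood value against a threshold of 0.0. In the reports shown, a value below the threshold is listed as an INTEGRAL likelihood with a transmembrane span (e.g. -4.30, residues 489 - 505), and a positive value as a PERIPHERAL likelihood. The following minimal sketch assumes that rule. The `Segment` class and `integral_segments` function are hypothetical names introduced for illustration and are not part of the prediction program itself.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One candidate segment from an ALOM-style report (hypothetical container)."""
    likelihood: float  # value printed after "Likelihood ="
    start: int         # first residue of the predicted span
    end: int           # last residue of the predicted span

    @property
    def length(self) -> int:
        # Residue positions are inclusive at both ends.
        return self.end - self.start + 1


def integral_segments(segments, threshold=0.0):
    """Keep segments whose likelihood falls below the threshold (assumed rule)."""
    return [s for s in segments if s.likelihood < threshold]


# Values taken from the SEQ ID 8 report: likelihood -4.30, span 489 - 505.
segments = [Segment(-4.30, 489, 505)]
hits = integral_segments(segments)
print(len(hits), hits[0].length)
```

Keeping the threshold as a parameter reflects that the reports print it explicitly, rather than treating 0.0 as a fixed constant.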

Example 5 

A DNA sequence (GBSx0002) was identified in S.agalactiae <SEQ ID 4043> which encodes the amino
acid sequence <SEQ ID 4044>. This protein is predicted to be lipoprotein MtsA. Analysis of this protein
sequence reveals the following:

Possible site: 19

>>> Seems to have no N-terminal signal sequence

----- Final Results -----

bacterial cytoplasm --- Certainty=0.3361 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9403> which encodes amino acid sequence <SEQ ID 9404>
was also identified.

A related DNA sequence was identified in S.pyogenes <SEQ ID 3177> which encodes the amino acid 
sequence <SEQ ID 3178>. Analysis of this protein sequence reveals the following:

Possible site: 13
>>> Seems to have no N-terminal signal sequence

----- Final Results -----

bacterial cytoplasm --- Certainty=0.2412 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below:
Identities = 146/168 (86%), Positives = 161/168 (94%)

Query: 1 MNLENGIIYSKNIAKQLIAKDE^NKAlTffiK^ 60 

+NLENGIIYSKNIAKQLIAKDPKNK TYEKN AYVMCtEKIDKEMCSKP+AI NKKLI 
Sbjct: 107 LNLEaJGIIYSKNIAKQLIAKDPKNKETYEKNI.^^ 166 

Query: 61 VTSEGCFKYFSKAYGVPSAYIWEINTEEEGTPDQITSLVKKLKQVRPS^ 120 

VTSBGCFKYFSKAYGVPSMIWEINTEEEGTPDQI+SL++KLK ++PSALFVESSVD+RP 
Sbjct: 167 VTSEGCFKYFSKAYGVPSAYIWEINTEEEGTPDQISSLIEKI^ 226 

Query: 121 MKSVSRESGIPIYAEIFTDSIAiaCGQICTSYYaMMIO«aU)KI^ 168

M++VS++SGIPIY+EIFTDSIAKKS+ GDSYYaMMKWNLDKI+EGIAK 
Sbjct: 227 MEWSKDSGIPIYSEIFTDSIAKKGKPGDSYHVMMKWSIi^ 274 

SEQ ID 9404 (GBS679) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 164 (lane 7-9; MW 36kDa) and in Figure 188 (lane 8; MW 36kDa). Purified
protein is shown in Figure 242, lanes 9 & 10. 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 6 

A DNA sequence (GBSx0003) was identified in S.agalactiae <SEQ ID 8485> which encodes the amino
acid sequence <SEQ ID 8486>. This protein is predicted to be ATP-binding protein MtsB. Analysis of this
protein sequence reveals the following: 

Possible site: 55 

>>> Seems to have no N-terminal signal sequence

----- Final Results -----

bacterial cytoplasm --- Certainty=0.2097 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related DNA sequence was identified in S.pyogenes <SEQ ID 8765> which encodes the amino acid
sequence <SEQ ID 8766>. Analysis of this protein sequence reveals the following:

Possible site: 29

>>> Seems to have no N-terminal signal sequence

----- Final Results -----

bacterial cytoplasm --- Certainty=0.1929 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below: 

Identities = 143/238 (60%), Positives = 186/238 (78%), Gaps = 2/238 (0%)

Query: 1 MIISKHLSVSYDNNL-VLEDINLRIEGSGIIQI^ 59 

MI + +L V+YD N US IN+ +B6 I+GI+GMIGAGKST MKA+L L+D G + 
Sbjct: 10 MITTNNLCVTXDGNSNaiBAINVTIEGPSIVGI^^ 69 

Query: 60 GG-DLLPIMtaiVAYVEQKTIMDYQPPITVGECVSI^YKER^ 118 

6 D Ii VAYVEQ++ IDY FPITV ECV+LQ Y + GLP+R+ K +E+V +V+ 
gbjct: 70 DGKDGRKLGffrVAYVEQRSMIDYNFPlTVKECVMCTYSK^^ 129 

Query: 119 QVGLRGFEmPINALSGGQFQRMrj1ARa,VQEAOT 17B 

QVGXi F +RPI +I£GGQFQRMrfJVRai+(^DyiFIiDEPFVGIDS+SE+IIV+Lr^^ 
Sbjct: 130 a\raLBDBGHRPIKSIiS(3GQFQRI^VARCLIQBSDYIFi:J)BPFW 189 

Query: 179 SKAGKLILVVHHDI^KVDHYFDQVIxriNRHLIAGGPIDQZi^^ 236 

AGK IL+VHHDLSKV+HyFD+++ILN+HL+A G + + FT + LS AYG+ ++I1G+ 
Sbjct: 190 KMAGKTILIVHHDI^KVKHYFDKrMriiNKHLVAYGNVC^ 247 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 
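
Each "Final Results" block in these examples lists three candidate locations with a certainty value, and the location reported as "Affirmative" is the one with the highest certainty. The sketch below shows one way such a block could be read programmatically. It assumes cleaned-up report lines of the form used in its sample text. The function names `parse_final_results` and `best_location` are hypothetical and introduced only for illustration.

```python
import re

# Matches lines such as "bacterial cytoplasm --- Certainty=0.2097 (Affirmative) < succ>".
LINE = re.compile(
    r"bacterial\s+(\w+)\s*-*\s*Certainty\s*=\s*([\d.]+)\s*\((.+?)\)"
)


def parse_final_results(text):
    """Map each location name to a (certainty, verdict) tuple."""
    results = {}
    for match in LINE.finditer(text):
        location, certainty, verdict = match.groups()
        results[location] = (float(certainty), verdict)
    return results


def best_location(results):
    """Return the location with the highest certainty value."""
    return max(results, key=lambda loc: results[loc][0])


# Sample text: the values reported for SEQ ID 8486 in Example 6.
report = """
bacterial cytoplasm --- Certainty=0.2097 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
"""

results = parse_final_results(report)
print(best_location(results), results[best_location(results)])
```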

Example 7

A DNA sequence (GBSx0004) was identified in S.agalactiae <SEQ ID 9> which encodes the amino acid
sequence <SEQ ID 10>. Analysis of this protein sequence reveals the following:

Possible site: 28

>>> Seems to have an uncleavable N-term signal seq

----- Final Results -----

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 8 

A DNA sequence (GBSx0005) was identified in S.agalactiae <SEQ ID 11> which encodes the amino acid
sequence <SEQ ID 12>. This protein is predicted to be integral membrane protein MtsC (znuB). Analysis 
of this protein sequence reveals the following: 

Lipop: Possible site: -1 Crend: 6 
McG: Discrim Score: 3.77
GvH: Signal Score (-7.5): -0.47

Possible site: 45
>>> Seems to have a cleavable N-term signal seq.

*** Reasoning Step: 3

----- Final Results -----

bacterial membrane --- Certainty=0.5331 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related DNA sequence was identified in S.pyogenes <SEQ ID 13> which encodes the amino acid
sequence <SEQ ID 14>. Analysis of this protein sequence reveals the following:

Possible site: 45
>>> Seems to have a cleavable N-term signal seq.

----- Final Results -----

bacterial membrane --- Certainty=0.5501 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below: 

Identities = 224/275 (81%) , Positives = 255/275 (92%) 

Query: 1 MFTKFFBGriTYHFLQNAFITArviGIVAGAVGCFIILRSMSI^^ 60 

M KFFBGL++yHFLQNA ITA+VIGIV4<3AVGCFIIIJlSMSIMGnAISH^ 
Sbjct: 1 MSMKFPEGI14SYHFLC3NMiITAWIGIVSGK 60 

Query: 61 IMXNFFIGAIVFGTJ^IIITYIKEMSVIRGm'AIGIT^^ 120 

IM+NFFIGAI+FGIi+S+IITyiKEMSVIKGITEAIGITFSSFriAL^ 
Sbjct: 61 iri6VNFPIGAIIFa[irASVIITnKENSVIK(a^^ 120 

Query: 121 FHIIiFGNILAVQDeDKYMTIIVGLrVLTLITIP 180 

FHIIiPaTIIAVQDSDK++TI V + VL +I++FFKEIimjTSH)P+LAKSMG++V+ YHJfL 
Sbjct: 121 FHlLFGiaiiAVQDSDKWITIGVSIFVLVVISLFFKEI^ 180 

Query: 181 LMIia^TLVAVTaMQSVGriLXVALLITPAAT&YLYVK^ 240 

LM+IiTLVAVTAMQSVGTILrWAIiIiITO^ SL+ MIi +SS LGA+ASVLGLY+ 

Sbjct: 181 IMVLLTLVAVTAMQSVGTILIVALMT^^ 240 

Query: 241 6YTFNIAAGSSIVLTSTFMFLLAFLFSPKQSLFKK 275 

GYTFN-fAAGSSIVLTS MFL++F SPKQ K+ 
Sbjct: 241 GYTENVAftGSSraiTSAMMPLISFFVSPKQGYLKR 275 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 
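
The alignment summaries in these examples report identities and positives as a count over the alignment length, followed by a percentage. The reported percentages are consistent with rounding down (for instance 208/274 is 75.9% but is reported as 75%). The following sketch recomputes the percentages under that assumption. The names `parse_summary` and `truncated_percent` are hypothetical helpers introduced for illustration.

```python
import re

# Matches summary fields such as "Identities = 224/275 (81%)".
FIELD = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_summary(line):
    """Return {field: (count, length, reported_percent)} for one summary line."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in FIELD.finditer(line)
    }


def truncated_percent(count, length):
    """Percentage rounded down; assumed here to be the reporting convention."""
    return count * 100 // length


# Summary line from the GAS/GBS alignment in Example 8.
summary = parse_summary("Identities = 224/275 (81%), Positives = 255/275 (92%)")
for field, (count, length, reported) in summary.items():
    # Print the recomputed value next to the value stated in the summary.
    print(field, truncated_percent(count, length), reported)
```

A mismatch between the recomputed and reported percentage would flag a summary line whose digits may have been damaged in transcription.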

Example 9 

A DNA sequence (GBSx0006) was identified in S.agalactiae <SEQ ID 15> which encodes the amino acid
sequence <SEQ ID 16>. Analysis of this protein sequence reveals the following:

Possible site: 38

>>> Seems to have no N-terminal signal sequence

----- Final Results -----

bacterial cytoplasm --- Certainty=0.1280 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 10 

A DNA sequence (GBSx0007) was identified in S.agalactiae <SEQ ID 17> which encodes the amino acid
sequence <SEQ ID 18>. This protein is predicted to be peptidyl-prolyl cis-trans isomerase 10 (rotamase). 
Analysis of this protein sequence reveals the following: 

Lipop: Possible site: 19 Crend: 2
McG: Discrim Score: 5.27
GvH: Signal Score (-7.5): -4.14

Possible site: 19
>>> May be a lipoprotein



